ABSTRACT

Every  scientific  community  reveals  its  shared  beliefs  and  values, 
its  great  achievements  and  persistent  problems,  and  its  current 
state-of-the-art  and  evolutionary  future  through  its  model-building 
activity  and  its  scientific  communications.  To  survey  and 
comprehend  the  field  of  actuarial  science,  then,  one  must  examine, 
classify,  and  comment  upon  the  basic  paradigms  —  the  accepted 
concepts-models-puzzles-solutions  —  that  are  revealed  in  the 
literature  of  risk  and  insurance  theory. 

This  paper  attempts  such  a  survey,  based  upon  the  communications 
submitted to Topic 1 of the 21st ICA, "Generalized Models of the
Insurance Business," and upon a selected portion of the explosive
numbers  of  papers  and  books  which  have  appeared  in  the  open 
literature  in  the  last  decade.  Clearly,  many  of  the  basic 
paradigms  are  still  in  a  state  of  flux,  as  new  ideas  and  methods, 
in  many  cases  imported  from  other  disciplines,  have  begun  to 
compete  with  previously  accepted  paradigms,  causing,  in  some 
cases, displacements of methodology and minor revolutions in
conceptual  approach. 

In addition to a general model survey, the paper considers the
influence  of  ideas  from  other  scientific  disciplines,  and  then 
examines  in  detail  two  specific  areas  where  traditional  modelling 
has  been  called  into  question  —  the  classical  approach  to 
statistical  estimation  and  prediction,  and  the  fair-premium 
approach  to  risk  classification.  In  conclusion,  some  inferences 
about  the  near-term  future  of  insurance  modelling  are  made. 


INTRODUCTION 


It  is  a  distinct  privilege  and  a  personal  pleasure  to  address 
the  21st  ICA  on  the  topic  of  "Generalized  Models  of  the  Insurance 
Business".  In  addition  to  thanking  the  Organizing  Committee  for 
their  invitation,  I  would  also  like  to  acknowledge  the  role  of 
the  U.S.  Section  of  the  International  Actuarial  Association  in 
graciously  proposing  my  name  for  membership  in  this  international 
scientific community; however, I must say that it was not at all
clear  to  me  that  the  by-laws  require  that  every  new  member  must 
present  a  communication  at  his  first  International  Congress! 

As  I  read  through  the  42  papers  offered  on  this  topic,  I 
became  somewhat  uneasy  at  the  variety  and  divergence  of  the 
concepts  and  models  presented,  and  this  feeling  was  only 
heightened  when  I  began  to  reflect  upon  the  many  and  varied  papers 
and  textbooks  in  insurance  and  risk  theory  which  have  appeared 
in  the  past  decade.  To  comprehend  all  of  this  is,  as  we  say  in 
English,  "like  trying  to  take  a  drink  of  water  out  of  a  firehose." 
What  could  a  physicist-engineer-operations  researcher  who  has  not 
had  extensive  actuarial  practice  hope  to  add  to  this  topic,  which 
is  central  to  the  profession? 

Nevertheless,  model-building  is  an  activity  which  is  common 
to  all  scientific  disciplines,  and  upon  which  scientific 
historians  and  philosophers  have  had  much  to  say.  I  resolved, 
therefore,  to  first  of  all  set  out  a  philosophical  framework  — 
a  model  of  model-building  —  in  which  we  could  begin  to  understand 
this recent explosion of activity in insurance model-building.


SOME PHILOSOPHY ON MODELS AND MODEL-BUILDING

Basic Characteristics

All  of  you  are,  I  am  sure,  familiar  with  the  means  and  ends 
peculiar to what we call the scientific approach [B47, W4].
This  complex  of  shared  values  and  attitudes  towards  problem-posing 
and  problem-solving  is  so  ingrained  in  our  everyday  work  and 
communication  with  our  colleagues  that  it  is  difficult  to  explain 
to  the  layman  exactly  what  we  mean  by  hypothesizing,  experimentation, 
measurement,  model  construction,  calibration  and  validation,  and 
implementation.

The  communication  problem  becomes  even  more  difficult  when 
we attempt to define what we mean by a model, for the term is used
in a bewildering variety of ways by the scientific community,
ranging from material representations of phenomena through factual
interpretations of real-world behavior to purely symbolic
idealizations of abstract theories [B47]. For our purposes, we
can, however, be more pragmatic, and attempt a definition as
follows:

A  model  is  a  set  of  verifiable  mathematical  relationships 
or  logical  procedures  which  is  used  to  represent  observed, 
measurable  real-world  phenomena,  to  communicate  alternative 
hypotheses  about  the  causes  of  the  phenomena,  and  to  predict 
future  behavior  of  the  phenomena  for  the  purposes  of 
decision-making.

We  might  call  this  an  operational  definition,  for  I  have  purposely 
excluded  certain  gedanken  exercises  by  focusing  on  observable, 
measurable phenomena, and insisting that the ultimate goal of
model-building is either to serve as a tool for communicating with
other scientists and society at large about the nature of the
phenomena, or to predict and make decisions about the future
behavior  of  the  phenomena  —  which  in  our  case  are  all  the  risky 
contingencies  associated  with  life,  death,  and  the  loss  of  economic 
property  on  this  Earth.  I  hope  you  agree  with  this  starting  point. 

Paradigms,  Puzzles,  Communications,  and  Revolutions 

But  to  gain  a  better  perspective  upon  current  activity  in 
models  and  model-building  in  actuarial  research  and  insurance 
practice,  we  shall  need  a  broader  Weltanschauung  than  a  mere 
description  of  model  characteristics  and  model-building  activities; 
I  have  adopted  the  title  and  organization  of  this  paper  from  the 
seminal  ideas  put  forward  by  Thomas  S.  Kuhn  in  The  Structure  of 
Scientific Revolutions [K7].

To  paraphrase  broadly,  Kuhn  defines  a  paradigm  as  those 
universally accepted scientific achievements, concrete
concepts-models-puzzles-solutions-examples, that for a time provide model
solutions  for  a  community  of  scientific  professionals;  more 
generally,  it  can  stand  as  a  banner  for  the  entire  constellation 
of  symbolic  generalizations,  shared  beliefs,  judgement  values, 
techniques,  and  so  on  shared  by  the  community  —  for  example, 
those  ideas,  concepts,  models,  and  solutions  shared  by  the 
actuaries  of  the  world.  When  such  a  paradigm  is  universally 
accepted,  then  the  researchers  in  this  community  are  free  to 
engage  in  normal  science,  that  is,  the  highly-directed,  paradigm- 
based  theoretical,  experimental,  and  empirical  investigations 
which  provide  the  elaboration  and  mop-up  work  needed  to  apply  the 
paradigm  in  the  businesses  and  general  society  which  support  the 
community.

But, in addition to the "bread-and-butter" work permitted
by  normal  science,  it  must  be  recognized  that  these  accepted 
scientific  laws,  models,  and  concepts  are  inherently  self-limiting 
in  guiding  the  theoretical  activity  of  the  profession,  since  the 
paradigm  itself  provides  the  criteria  for  choosing  future  research 
areas,  which  can  only  be  assumed  to  possess  solutions  of  an 
accepted  and  understood  nature.  In  other  words,  normal  science 

"...seems  an  attempt  to  force  nature  into  the  preformed 
and relatively inflexible box that the paradigm supplies.
No part of the aim of normal science is to call forth new
sorts  of  phenomena;  indeed  those  that  will  not  fit  the 
box  are  often  not  seen  at  all.  Nor  do  scientists  normally 
aim  to  invent  new  theories,  and  they  are  often  intolerant 
of  those  invented  by  others.  Instead,  normal  scientific 
research  is  directed  to  the  articulation  of  those  phenomena 
and theories that the paradigm already supplies." [K7, p. 24].

But  how,  then,  is  science  to  make  progress  and  overcome  this 
inherent  limitation?  Kuhn  believes  that,  at  first,  progress 
occurs  precisely  because  of  this  restriction  of  focus  and  attention 
to  detail,  which  means  that,  as  scientists  increasingly  satisfy 
the  needs  of  the  society  which  pays  the  bill,  they  will  tend  to 
turn their research attention to puzzles, "that special category
of  problems  that  can  serve  to  test  ingenuity  or  skill  in  solution." 
Increasingly,  communications  with  other  research  workers  become 
more esoteric and inaccessible to the general public, research
monographs  and  conferences  are  devoted  to  single  mathematical 
puzzles,  and  there  is  increasing  tension  between  the  theoreticians 
and  the  practitioners,  who  feel  that  there  are  important  real 
problems  still  unsolved,  but  find  it  more  and  more  difficult  to 
communicate  with  the  specialists.  Nevertheless,  the  normal  science 
seems to be progressing more and more rapidly and predictably,
as  judged  by  the  shared  paradigm. 

Then,  something  begins  to  happen  which  provokes  a  crisis 
within  the  profession.  In  the  physical  sciences,  this  might  be 
anomalies  in  experimental  data  which  cannot  be  explained  away, 
or  the  empirical  discovery  of  a  completely  new  phenomenon  not 
covered  by  the  old  paradigm.  I  believe  the  correct  analogy  in 
actuarial  science,  which  is  governed  by  the  laws  of  economics  and 
the marketplace, is that some new phenomenon, such as hyperinflation,
changing living habits, or the application of novel
technology  begins  to  affect  our  collective  social  and  economic 
behavior,  which  in  turn  contradicts  the  assumptions  behind 
traditional  insurance  products.  Or,  the  occurrence  of  certain 
natural  phenomena,  such  as  earthquakes,  transportation  disasters, 
sickness  and  disease,  affects  insurance  statistics  through  the 
ways  in  which  man  and  nation  attempt  to  adapt  and  organize 
themselves  to  combat  these  disasters. 

At  first,  the  reaction  to  these  crises  is  simply  increased 
activity  within  the  old  paradigm,  as  attempts  are  made  to  study 
the  anomaly  and  to  patch  up  those  methods  and  models  which  worked 
so  well  in  the  past.  But  at  some  point,  the  difficulty  in  the 
paradigm-nature fit can no longer be set right by the
traditional  processes,  and,  precisely  because  of  the  excellent, 
specialized  communication  network  between  specialists: 

"...the  anomaly  itself  now  comes  to  be  generally  recognized 
as  such  by  the  profession.  More  and  more  attention  is  devoted 
to it by more and more of the field's most eminent men. If it
still continues to resist ... many of them may come to view its
resolution as the important subject matter of the field"
[K7, p. 82].


Many  divergent  partial  solutions  will  be  attempted,  and  specialists 
from  neighboring  disciplines  will  try  their  hand  at  resolving  the 
anomaly  through  the  introduction  of  other  points  of  view  and 
methodologies.  Corporate  management,  regulators,  and  legislators 
will  also  try  to  resolve  matters  directly  through  their  powers, 
rather  than  waiting  for  the  community  to  resolve  the  anomaly. 
Through  this  proliferation  of  ad  hoc  adjustments,  the  rules 
governing the paradigm will become increasingly blurred,
practitioners may begin to disagree on the nature and basic hypotheses
of  the  field,  and  shared  standards  of  value  and  judgement  may  be 
called  into  question. 

Then  finally  occurs  what  Kuhn  calls  a  scientific  revolution  — 
the  appearance  of  a  competing  paradigm  which  begins  to  accumulate 
a  weight  of  evidence  and  coherence  and  to  attract  an  increasing 
number of disciples and camp-followers, especially if it
satisfactorily resolves the pressing anomaly and provides useful guides
to  action  by  the  practitioners  who  pay  the  bills  and  the  regulators 
and  legislators  who  answer  to  the  general  public.  But  this 
revolutionary  process  may  proceed  slowly,  for  it  often  requires 
an  important,  discontinuous  shift  in  world-view  in  the  scientific 
community.  Some  practitioners  are  forever  resistant,  because 
lifelong,  productive  careers  and  reputations  commit  them  to  an 
older  tradition  of  normal  science.  And  often,  the  arguments  which 
are  most  convincing  in  favor  of  the  new  paradigm  are  not  easily 
explained  in  the  old  terminology.  "In  a  sense  that  I  am  unable 
to  explicate  further,  the  proponents  of  competing  paradigms 
practice  their  trades  in  different  worlds"  [K7,  p.150]. 

Evolutionary  progress  according  to  Kuhn,  then,  must  occur 
through a series of discontinuous steps: the formulation of a
successful  paradigm  and  the  development  of  the  professional 
community  which  shares  that  paradigm;  the  solution  of  a  large 
variety  of  practical  problems  which  establishes  the  discipline 
and leads to an active phase of normal science; then, a
movement towards more and more "purposeless" puzzle-solving and
increasingly  specialized  and  esoteric  communication;  followed, 
sooner  or  later,  by  an  anomaly  in  theory  or  an  application  disaster 
which  forces  a  crisis  upon  the  profession.  The  resolution  of 
this  crisis  requires  the  appearance,  testing,  and  acceptance  of 
a  competing  paradigm  which  more  successfully  solves  the  problem 
at  hand.  But  because  this  scientific  revolution  causes  a  dramatic 
shift  in  values  and  concepts,  its  further  growth  and  influence 
cannot  be  predicted,  but  only  be  discussed,  tested,  and  applied 
by  the  reformed  community,  which  must  adapt  to  survive. 

A  Personal  View 

My  personal  view,  if  you  will  permit,  is  that  something  like 
this  progression  described  by  Kuhn  is,  in  fact,  now  occurring  in 
insurance  modelling.  Perhaps  revolution  is  too  strong  a  term. 
Nevertheless,  I  hope  to  convince  you  that,  as  revealed  by  your 
own  communications,  we  are  at  a  very  interesting  epoch  in  the 
history  of  actuarial  science  —  on  the  one  hand,  there  is  a 
fruitful  and  prosperous  synthesis  between  and  multiplication  of 
basic  paradigms  which  are  shared  by  all,  and  yet,  at  the  same 
time,  there  is  a  progression  towards  more  and  more  academic 
puzzle-solving,  together  with  disquieting  news  from  the  real-world- 
application front line.

Let us begin by considering those risk and insurance business
paradigms  upon  which  we  all  agree  and  surveying  their  current 
development,  with  a  few  remarks  on  their  strengths  and  weaknesses. 
I  will  next  comment  upon  the  introduction  of  new  points  of  view 
from  other  scientific  disciplines,  and  then  consider  in  detail 
two  specific  areas  where  a  crisis  is  in  progress  —  statistical 
estimation  and  prediction,  and  risk  classification. 

In  selecting  additional  references  to  demonstrate  the  variety 
and  growth  of  the  field,  I  have  rather  arbitrarily  limited  myself 
to  articles  which  have  appeared  within  the  last  five  years,  or 
which  seemed  to  be  particularly  useful.  And  because  many 
national society journals were not available to me, these
contributions are also underrepresented. My apologies to colleagues
who find their favorite references missing.


BASIC RISK PARADIGMS


General  Characteristics 


Traditionally,  the  basic  models  of  risk  are  divided  into 
two  distinct  classes:  those  used  in  life  insurance  companies, 
and  those  which  arise  in  non-life  applications.  This  class 
distinction  has  been  slowly  vanishing,  as  new  risk  coverages  have 
demanded  a  combination  of  the  two  approaches,  and  as  work  on 
more  easily  shared  insurance  business  models  has  progressed. 

As  stated  by  Professors  Amsler  [80.1],  Buhlmann  [76.20],  and 
Franckx  [76.21],  and  many  others,  there  seems  now  to  be  general 
agreement  that  the  basic  mathematical  models  common  to  all 
branches  of  insurance  have  three  key  elements: 


(1)  One  or  more  random  variables  which  characterize  the 
major  dimensions  of  the  risk,  such  as  duration,  size, 
and  number; 

(2) A set of well-defined states of nature, separated by
observable transition events or epochs, together with
a  deterministic  or  stochastic  law  of  motion  between 
the  states; 

(3) An economic function, associated with the underlying
random variables and/or the states and transition
events, which may also be deterministic or random. It
is most often linked to uncontrollable economic
externalities, such as market growth, inflation, currency
risk, etc., but also to economic performance under the
control of the company, such as profit margin, portfolio
performance, etc.


Even  though  one  or  more  of  these  elements  may  appear  to  be 
missing  in  the  simplest  models,  it  is  usually  merely  suppressed 
by long-standing convention, hypothesis, or for reasons of
simplicity.  As  one  begins  to  construct  more  and  more  complex 
models  of  actual  insurance  operations,  or  to  build  large-scale 
simulations, then all of these factors begin to come into play;
indeed,  one  is  often  forced  to  synthesize  and  orchestrate  a 
number  of  simpler,  specialized  risk  models. 

To  illustrate  this  rather  philosophical  point,  let  us  consider 
some  of  the  basic  risk  paradigms  currently  in  use.  Some  of  the 
results  expressed  in  this  Section  may  seem  like  "old  wine  in  new 
bottles,"  but  I  hope  that  the  new  bottles  and  labels  will  help 
you  in  reorganizing  your  wine  cellar,  and  eliminating  vintages 
which  are  past  their  prime  years. 

We shall use tildes to denote random variables, and E and
V to denote the expectation and variance operators, respectively.
In other words, \(\tilde{x}\) is a random variable with observed value x,
and \(E\tilde{x}\) and \(V\tilde{x}\) are its mean and variance.

Life  Contingencies 

In  life  assurances,  the  state  space  is  extremely  simple  — 
a  person  is  living,  and  then  after  the  unique  event,  death,  makes 
a  transition  to  deceased  status;  the  basic  random  variable  of 
interest is \(\tilde{T}_x\), the remaining lifetime from moment of
underwriting until death, given the information (x), usually the
individual's age, sex, health, etc. The probability distribution
of \(\tilde{T}_x\) is usually given by an empirically observed mortality
table,  although  certain  analytic  laws  are  sometimes  explicated, 
calibrated,  and  used.  Often,  the  remaining  life  of  an  individual 
insured at age x, but now aged x+t, given that he is still
alive,  is  assumed  to  be  given  by  the  same  ("non-select")  table 
or  law.  Life  companies  worry  a  lot  about  receiving  business  from 
individuals  who  may  know  more  about  their  own  personal  mortality 
than that reflected in this general law.

There are two primary cash flows of interest: a $1 lump sum
payable at time t, and a hypothetical stream of $1 per year,
payable continuously from time zero to time t. The economic
functions of interest are the associated present values at the
origin. If the force of interest is \(\delta\) per year, then we find
easily the present value of the lump sum as \(A(t) = \exp(-\delta t)\),
and the present value of the continuous annuity as
\(a(t) = \delta^{-1}[1 - \exp(-\delta t)]\). These functions and their present values
can be moved forward h time units by multiplying by the "shift
operator", \(\exp(\delta h)\); different fixed "face amounts" of F$, or
f$/year, just multiply A and a, respectively; more complex
combinations of cash flows follow by superposition, or
superposition and shifting, or convolution. For instance, the present
value of an annuity of $1/year from epoch \(t_1\) to \(t_2\) is simply
\(a(t_1,t_2) = a(t_2) - a(t_1)\); a perpetuity beginning at \(t_1\) has
value \(a(t_1,\infty) = \delta^{-1} - a(t_1) = \delta^{-1}A(t_1)\); etc. Even a
variable-face continuous annuity of amount f(t) $/year at time
t \((0 \le t \le \infty)\) has present value
\(\int e^{-\delta t} f(t)\,dt = \int A(t) f(t)\,dt\),
from which integration by parts gives \(F(0+) + \delta \int A(t) F(t)\,dt\),
with F(t) the integral of f(t); from this "transform
calculus" a whole new variety of forms and interpretations can
be obtained. But that's essentially all there is to economic forms
in life assurance.
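
The present-value functions described above can be sketched in a few lines of code. This is only an illustration: the function names and the numerical values of the force of interest and the epochs are assumptions chosen for the example, not part of the original exposition.

```python
import math


def lump_sum_pv(t, delta):
    """A(t) = exp(-delta * t): present value of 1 paid at time t."""
    return math.exp(-delta * t)


def annuity_pv(t, delta):
    """a(t) = (1 - exp(-delta * t)) / delta: 1 per year paid continuously on [0, t]."""
    # expm1 keeps accuracy when delta * t is small
    return -math.expm1(-delta * t) / delta


def deferred_annuity_pv(t1, t2, delta):
    """a(t1, t2) = a(t2) - a(t1): 1 per year paid continuously from t1 to t2."""
    return annuity_pv(t2, delta) - annuity_pv(t1, delta)


def deferred_perpetuity_pv(t1, delta):
    """a(t1, infinity) = 1/delta - a(t1): perpetuity of 1 per year beginning at t1."""
    return 1.0 / delta - annuity_pv(t1, delta)


if __name__ == "__main__":
    delta = 0.05          # illustrative force of interest (assumed value)
    t1, t2 = 10.0, 25.0   # illustrative epochs (assumed values)
    print("A(t1)       =", lump_sum_pv(t1, delta))
    print("a(t1, t2)   =", deferred_annuity_pv(t1, t2, delta))
    print("a(t1, inf)  =", deferred_perpetuity_pv(t1, delta))
    print("A(t1)/delta =", lump_sum_pv(t1, delta) / delta)
```

The last two printed values agree, which is the perpetuity identity quoted in the text; shifting works the same way, since an annuity from t1 to t2 equals A(t1) times a(t2 - t1).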

The link to the underlying risk process is through the
durations or epochs of these forms, which are now random variables
like \(\tilde{T}_x\). For instance, the present value of $1 paid at the death
of (x) is simply the transformed random variable \(A(\tilde{T}_x)\), and
the present value of $1/year payable until the death of (x) is
the transformed random variable
\(a(\tilde{T}_x) = \delta^{-1}[1 - A(\tilde{T}_x)]\). Making
the necessary transformation from a given analytic form of the
density for \(\tilde{T}_x\) to the densities of these new values, and
calculating the moments, etc., is a standard exercise in a first-year
course in probability.
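
As one instance of that exercise, here is a worked sketch under an assumption that is not made in the text: a constant force of mortality \(\mu\), so that the remaining lifetime is exponentially distributed. The symbol \(\tilde{Z}\) for the discounted lump sum is introduced only for this illustration.

```latex
% Assumption (illustrative only): f_T(t) = \mu e^{-\mu t}, t > 0.
\begin{align*}
\tilde{Z} &= A(\tilde{T}_x) = e^{-\delta \tilde{T}_x}, \qquad 0 < \tilde{Z} < 1, \\
P\{\tilde{Z} \le z\} &= P\{\tilde{T}_x \ge -\delta^{-1}\ln z\} = z^{\mu/\delta}, \\
f_Z(z) &= \frac{\mu}{\delta}\, z^{\mu/\delta - 1}, \qquad 0 < z < 1, \\
E\tilde{Z} &= \frac{\mu}{\mu+\delta}, \qquad
V\tilde{Z} = \frac{\mu}{\mu+2\delta} - \left(\frac{\mu}{\mu+\delta}\right)^{2}.
\end{align*}
```

Since the annuity value is a linear function of \(\tilde{Z}\), its density, mean, and variance follow at once from these.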

In fact, as has been forcibly stated many times [G8, G11, H17,
T13], the fair premiums which are of interest to the actuary are
nothing more than the mean values of these random variables; in
classical notation:

\[ \bar{A}_x = E\,A(\tilde{T}_x) , \qquad \bar{a}_x = E\,a(\tilde{T}_x) . \]

Note that this approach to fair premiums is independent of whether
\(\tilde{T}_x\) is continuous or quantized to end-of-year deaths, and that
periodic  rather  than  continuous  payments  will  be  reflected  in  the 
choice  of  economic  function,  not  necessarily  in  the  range  of  the 
random  variable.  Further,  many  of  the  so-called  "theorems" 
relating  different  fair  premiums  are  trivial  in  this  context,  and 
(see  above)  may  often  hold  for  the  random  variable  as  well  as  for 
its  mean  value. 
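
To make the "premiums are just mean values" point concrete, the following sketch estimates both means by simulation. It assumes, purely for illustration, an exponential remaining lifetime with constant force of mortality mu (for which the means are known in closed form); the function name, parameter values, sample size, and seed are all hypothetical choices.

```python
import math
import random


def simulate_fair_values(mu, delta, n_samples=200_000, seed=0):
    """Estimate E A(T) and E a(T) by sampling T ~ Exponential(mu) (an assumed law)."""
    rng = random.Random(seed)
    total_lump = 0.0
    total_annuity = 0.0
    for _ in range(n_samples):
        t = rng.expovariate(mu)               # one simulated remaining lifetime
        lump = math.exp(-delta * t)           # A(T): value of 1 paid at death
        total_lump += lump
        total_annuity += (1.0 - lump) / delta  # a(T): 1 per year until death
    return total_lump / n_samples, total_annuity / n_samples


if __name__ == "__main__":
    mu, delta = 0.02, 0.05                    # illustrative values
    A_bar, a_bar = simulate_fair_values(mu, delta)
    print("simulated  A_bar, a_bar:", A_bar, a_bar)
    print("closed form A_bar, a_bar:", mu / (mu + delta), 1.0 / (mu + delta))
    # The relation A + delta * a = 1 holds for every sample, hence for the means.
    print("A_bar + delta * a_bar   :", A_bar + delta * a_bar)
```

The identity linking the two means is inherited from the identity between the random variables themselves, which is the sense in which such "theorems" become trivial.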

Multilife contingencies are also simpler from the point of
view of the random variable. Consider, for example, the
reversionary continuous annuity to (x) after (y), and suppose we know
the joint density of the two random remaining lifetimes, call them
\(\tilde{T}_x\) and \(\tilde{T}_y\). If we define a new random variable,
\(\tilde{T}_{xy} = \min(\tilde{T}_x, \tilde{T}_y)\),
then by direct arguments, this new economic function is nothing
more than the present value of a continuous annuity paid from
\(\tilde{T}_{xy}\) to \(\tilde{T}_x\), i.e.,
\(a(\tilde{T}_{xy}, \tilde{T}_x) = a(\tilde{T}_x) - a(\tilde{T}_{xy})\). Not only is
the usual magical identity true for the random discounted value
as well as the mean, it can be seen immediately that the
assumption of independence of the two lives is not needed, provided
simply that the marginal density of \(\tilde{T}_{xy}\) is correctly calculated.
In an obvious analogy to reliability engineering, more
complicated multi-life contingencies are easily handled through
Boolean logic, and associated min and max operators.
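
The claim that the identity holds outcome by outcome, with no independence assumption, can be checked numerically. The sketch below generates deliberately dependent lifetimes through a common-shock construction; that construction, the parameter values, and all function names are assumptions made for the illustration only.

```python
import math
import random


def annuity_pv(t, delta):
    """a(t): present value of 1 per year paid continuously on [0, t]."""
    return -math.expm1(-delta * t) / delta


def reversionary_pv_direct(t_x, t_y, delta):
    """Discount the payments themselves: 1 per year while (x) lives and (y) is dead."""
    if t_y >= t_x:
        return 0.0  # (x) dies first, so nothing is ever paid
    return (math.exp(-delta * t_y) - math.exp(-delta * t_x)) / delta


def reversionary_pv_identity(t_x, t_y, delta):
    """Same value via a(T_x) - a(T_xy), with T_xy = min(T_x, T_y)."""
    return annuity_pv(t_x, delta) - annuity_pv(min(t_x, t_y), delta)


def dependent_lifetimes(rng, mu_x=0.03, mu_y=0.04, mu_shock=0.01):
    """Common-shock model (assumed): both lives end no later than a shared event."""
    shock = rng.expovariate(mu_shock)
    return min(rng.expovariate(mu_x), shock), min(rng.expovariate(mu_y), shock)


def max_discrepancy(n_samples=10_000, delta=0.05, seed=1):
    """Largest pathwise gap between the direct value and the identity."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(n_samples):
        t_x, t_y = dependent_lifetimes(rng)
        gap = abs(reversionary_pv_direct(t_x, t_y, delta)
                  - reversionary_pv_identity(t_x, t_y, delta))
        worst = max(worst, gap)
    return worst


if __name__ == "__main__":
    print("largest pathwise discrepancy:", max_discrepancy())
```

The discrepancy is at the level of floating-point rounding for every simulated pair, dependent or not; only the expectation of the result requires the distribution of the joint-life status.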

Another  advantage  of  modelling  directly  with  the  random 
variables  is  that  variances  and  other  moments  can  easily  be 
obtained,  Hattendorf's  theorem  proven,  prospective  reserves 
calculated,  and  even  full  distributions  of  outcomes  obtained 
(see [G8, H11, T13] and references therein).

The  reason  that  I  have  spent  so  long  on  this  elementary 
modelling is, first of all, to convince you that a major
educational overhaul is overdue in this area. I regularly teach the
material covered in life contingency textbooks to (post)graduate
engineering  and  statistics  students  in  about  six  hours  of  lecture; 
we  are  then  free  to  analyze  variances  and  reserves  (such  as  the 
adaptive  reserve  model  described  in  the  Section  on  Bayesian 
Statistics), do mortality estimation, or proceed to casualty
risk  models.  I  admit  that,  at  the  undergraduate  level,  where  a 
student  may  have  had  only  one  course  in  probability  theory  with 
little  application,  it  might  take  longer,  say,  thirty  hours, 
to  introduce  the  basic  principles  of  life  contingencies;  but  by 
using  random  variable  concepts  and  methods  with  which  the  student 
is already familiar, it is possible to move quickly into more
"advanced" topics. In my opinion, there is a mismatch
between the training and the capabilities of today's students, and
the  demands  which  are  placed  upon  them  using  the  traditional 
expected-value  models.  Furthermore,  by  not  properly  laying  down 
the  fundamental  concepts  of  random  variables  and  the  basic  model 
assumptions,  a  potential  for  future  change  in  actuarial  modelling 
is  being  wasted  —  for  example,  in  developing  newer  models  for 
equity-linked  assurances,  pension  and  health-care  applications, 
or  in  strengthening  corporate  modelling  and  simulation. 

Secondly,  I  suppose  I  must  say  something  about  the  archaic 
notation  with  which  the  life  actuary  is  burdened,  and  which  is 
the subject of continued, in my opinion rather pointless,
discussion [76.22, B22]. In comparison with other scientific
communities, it seems as if actuaries lead the way in insisting that all
of  the  hypotheses  about  the  model,  not  just  the  features  of 
interest,  be  hung  as  "bells  and  whistles"  on  the  underlying 
variable,  as  in 

n|max  or  00  (n:m)  a(x)  (i;T) 

Although this approach may guarantee full employment for
typesetters, it does not help in actual computations (where the
parameters  would  be  passed  to  the  computation  subroutine  by 
global  variables,  a  standardized  calling  sequence,  a  data  block, 
etc.),  and  it  definitely  impedes  scientific  communication  outside 
of actuarial circles. Being rigid about notation for mean values,
and requiring that the density for \(\tilde{T}_x\) always be written
\({}_{\tau}p_x\,\mu_{x+\tau}\), etc., makes the problem of defining
efficient, easily recognized random variable notation more
difficult, as examination of some of the earlier references will reveal.




Finally, from a modelling point of view, focusing on the
mean values of discounted random cash flows can obscure the
fundamental hypotheses being made, lead to erroneous
interpretations and conclusions, give incorrect or misleading financial
projections, and impede the construction of more general,
meaningful models; in short, "normal science" at its worst.
Because  this  point  is  so  important,  I  would  like  to  give  a 
simplified  example  using  random  variables. 

Suppose we are trying to find the fair level continuous
premium, \(\pi_x\) $/year, to charge (x) for a life assurance of $1,
payable at death. There are two random cash flows involved: the
receipt of premium income until death, and the payment for benefits
provided at death. Their difference, the underwriting gain to
the company, is also a time-dependent random cash flow. Suppose
we let \(\tilde{G}(0)\) be the present value, at the moment of underwriting,
of this random future gain. Using random variables introduced
previously, we obtain:


\[ \tilde{G}(0) = \pi_x\,a(\tilde{T}_x) - 1 \cdot A(\tilde{T}_x) . \]


Now, to find the premium rate, we must invoke a new modelling
assumption, the equivalence principle, which states that we
define a random cash flow to be "fair" if its mean value is zero
(and note that even this assumption may be modified if we adopt
the viewpoint of utility theory, see later). This new hypothesis
that \(E\tilde{G}(0) = 0\) is what furnishes the classical result
\(\pi_x = \bar{A}_x/\bar{a}_x\). But now we find directly other results, such as the
variance of this random gain:

\[ V\{\tilde{G}(0)\} = \left[1 + \frac{\pi_x}{\delta}\right]^{2} V\{e^{-\delta \tilde{T}_x}\} , \]


which  is  not  always  significant  for  human  mortality,  but  adds  a 
considerable  risk  in  engineering-economic  equipment  replacement 
studies. 
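
The variance result follows in one step once the annuity value is written in terms of the lump-sum value. A short derivation, using only the definitions of the present-value functions given earlier and writing \(\pi_x\) for the level premium rate and \(\delta\) for the force of interest:

```latex
\begin{align*}
\tilde{G}(0) &= \pi_x\, a(\tilde{T}_x) - A(\tilde{T}_x)
   = \pi_x \delta^{-1}\,[1 - A(\tilde{T}_x)] - A(\tilde{T}_x) \\
 &= \frac{\pi_x}{\delta} - \left[1 + \frac{\pi_x}{\delta}\right] e^{-\delta \tilde{T}_x} , \\
V\{\tilde{G}(0)\} &= \left[1 + \frac{\pi_x}{\delta}\right]^{2} V\{e^{-\delta \tilde{T}_x}\} .
\end{align*}
```

The constant term drops out of the variance, so all of the risk in the gain comes from the discounted value of the unit benefit, scaled by the bracketed factor.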

As time t passes, and the individual has not yet expired,
the random current value grows to \(\tilde{G}(t) = e^{\delta t}\,\tilde{G}(0)\). A priori,
this new random variable has zero mean for a fair premium rate,
but, at time t, we will know that death has not yet occurred,
and the distribution of the random remaining lifetime, \(\tilde{T}_{x+t}\),
will have to be recalculated, according to
\(\tilde{T}_{x+t} = (\tilde{T}_x - t \mid \tilde{T}_x > t)\).
Using basic definitions and rearranging terms, we find the current
value of random underwriting gain at time t, given that (x) is
still alive, to be:


\[ \tilde{G}(t) = \pi_x \delta^{-1}[\exp(\delta t) - 1] - [1 \cdot A(\tilde{T}_{x+t}) - \pi_x\,a(\tilde{T}_{x+t})] . \]


We  recognize  the  first  term  as  the  accumulated  "sure-thing"  premium 
income;  the  second  term  is  the  random  future  net  liability  to  the 
company,  discounted  back  to  epoch  t  —  indeed,  it  is  immediate 
that the mean value of this discounted liability is just the legal
level-premium reserve. So far so good.

But  now,  having  "led  you  down  the  garden  path"  with  familiar 
results,  slightly  generalized,  let  us  calculate  an  unfamiliar 
variable: the random amount of gain to the company at the moment
of expiration, which must be

\[ \tilde{G}(\tilde{T}_x) = \pi_x \delta^{-1}[\exp(\delta \tilde{T}_x) - 1] - 1 . \]


A priori, this amount is not zero in expectation; indeed, a little
calculation will show that for small forces of interest:

\[ E\{\tilde{G}(\tilde{T}_x)\} \approx \delta\, V\{\tilde{T}_x\}/E\{\tilde{T}_x\} > 0 \qquad (\delta \to 0) , \]

or for usual interest rates and standard mortality tables,
about 5-10% of the face value, on the average, is "left on the
table" for the company at expiration! This unexpected result
cannot even be developed through the use of classical expected-value
notation. And, I submit, explaining why this result does
not affect the profitability of the company is a valuable exercise
for the student in understanding the basic models of life
contingencies.
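
The positive expected gain at the moment of expiration is easy to check
numerically. The following is only a minimal sketch in Python, under
stated assumptions: a unit face value, a premium paid continuously at
the fair rate, and an arbitrary list of equally likely lifetimes
standing in for the mortality law. The function name and its return
layout are illustrative choices, not part of the original development.

```python
import math


def gain_at_death(lifetimes, delta):
    """Compare the exact mean gain at death with the small-delta approximation.

    lifetimes: equally likely remaining lifetimes T_x (an assumed, empirical law).
    delta:     force of interest.
    """
    n = len(lifetimes)
    # Net single premium E{exp(-delta T)} and the matching expected annuity value.
    single_premium = sum(math.exp(-delta * t) for t in lifetimes) / n
    annuity = (1.0 - single_premium) / delta
    premium = single_premium / annuity  # fair continuous premium rate

    # Mean of G(0): premiums discounted to time 0 minus the discounted face value.
    mean_g0 = sum(
        premium / delta * (1.0 - math.exp(-delta * t)) - math.exp(-delta * t)
        for t in lifetimes
    ) / n

    # Mean of G(T): premiums accumulated to the moment of death minus the face value.
    exact = sum(
        premium / delta * (math.exp(delta * t) - 1.0) - 1.0 for t in lifetimes
    ) / n

    mean_t = sum(lifetimes) / n
    var_t = sum((t - mean_t) ** 2 for t in lifetimes) / n
    approx = delta * var_t / mean_t  # small-delta approximation
    return premium, mean_g0, exact, approx
```

With a fair premium the mean of G(0) is zero by construction, while the
mean of G(T) stays positive; the two statements refer to values measured
at different epochs, which is the point of the exercise.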

Health  and  Pension  State  Models 

The  element  of  a  state  variable  is  especially  important  when 
we  progress  to  the  more  complex  models  of  health  and  pension 
insurances.  As  shown  in  Figure  1,  we  imagine  that  the  covered 
individual  moves  from  one  state  to  another  at  certain  random 
transition  epochs ,  and  that  these  states  define  different  coverages 
or  benefits  and  different  mortality  laws.  In  health  insurance, 
these  states  are  the  observable  differences  in  health,  conditions 
of  heart,  limbs,  or  teeth,  progress  of  a  disease,  marital  and 
family  status,  parturition,  etc.;  in  pensions,  the  states  can  refer 
to  active  or  disability  status,  eligibility  and  participation 
status, wage category, employment and participation history,
vesting, retirement status, and so on. In most cases, death is the
terminal  state,  although  it  is  often  important  to  condition  on 
the  previous  status  or  history. 

The traditional way of describing the associated laws of
motion is through multiple decrement theory, that is, a
failure-rate-oriented competing-risk approach (see, e.g. [H11]). While
this approach is useful for simple problems because it can be
explained in terms of related single-decrement tables, it supposes
that the sequence of states is in extended (tree-like) form. In
more complex models, it is often convenient to assume (like the
dotted line in Figure 1) that certain states can be recurrent.
Then, the more natural formulation is in terms of Markov chains
([C1, H14], and the references in [H17]). With arbitrary continuous
duration random variables, the model is, in fact, a Markov-renewal
(or semi-Markov) process, whose transition and duration
laws can be easily related to an equivalent competing-risks
formulation. Economic functions can be tied to both the transitions
and to the durations in certain states, but, of course,
making certain states recurrent implies equivalence of economic
sequences after re-entry.

If  the  underlying  duration  law  is  discrete,  as  in  transitions 
only  at  the  end  of  the  year,  or  on  birthdays,  etc.,  then  the 
Markov  chain  approach  can  have  the  state  space  expanded  to  include 
age,  years  of  service,  etc.,  and  the  only  law  of  interest  becomes 
a  rather  large,  special-structure  set  of  transition  probabilities 
determined  from  the  associated  decrement  tables  [B2,  H15] .  The 
advantage  of  this  approach  is  that  the  only  practical  calculations 
involved  are  the  multiplications  of  large  matrices,  which  can  be 
easily carried out on computers. Numerous other insurance calculations
[80.26] can be similarly simplified, and there are many
related  models  in  the  fields  of  demography,  sociology,  educational 
planning,  manpower  planning,  and  so  on. 
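
The matrix calculation can be sketched in a few lines of Python. This
is only an illustration under simplifying assumptions: three states
(active, disabled, dead), one-year steps, and transition probabilities
that do not depend on age. In the expanded state space described
above, age and service would be folded into the state. All numbers and
function names are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def mat_power(p, years):
    """Return the `years`-step transition matrix by repeated multiplication."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(years):
        result = mat_mul(result, p)
    return result


def state_distribution(initial, transition, years):
    """Probability of occupying each state after `years` steps."""
    return mat_mul([initial], mat_power(transition, years))[0]


# Hypothetical one-year transition probabilities; each row sums to one.
# States: 0 = active, 1 = disabled, 2 = dead (terminal).
P = [
    [0.90, 0.08, 0.02],
    [0.30, 0.60, 0.10],  # recovery makes the active state recurrent
    [0.00, 0.00, 1.00],
]
```

For example, `state_distribution([1.0, 0.0, 0.0], P, 2)` gives the
occupancy probabilities after two years for a life that starts active.
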

With such a natural and well-understood paradigm, it is
surprising to see that most of today's pension mathematics is
still oriented towards certainty equivalents. Admittedly, the
large number of variables and laws makes analytic manipulations
rather formidable; even recent dynamic models [B35, 80.48, 76.5]
require rather heroic mathematics to get expected-value results.
But there are other studies of more specialized topics [B2, S11,
S12, S13] which are beginning to apply a complete random-variable
analysis to pension problems, which, of course, is essential to
answering questions such as the probability that a given method
of funding will be adequate; this calculation need not be analytic,
but can be easily explored with modern computers. I expect to
see a rather substantial evolution of stochastic pension models
over the next few years.

The  field  of  health  insurance  seems  to  be  currently  oriented 
towards  data-gathering  and  empirical  investigations,  rather  than 
formal  modelling  —  in  part,  because  of  the  explosive  growth  in 
the technology and costs of health-care delivery (see Topic 3 of
this Congress). So I believe it will be many more years before
one can give a balanced survey of this field. [J3] mentions a
qualitative state-space model for monitoring the rehabilitation
process in workers' compensation claims, and there are similar
operations-research-oriented models in the health care field.

Accumulations  of  Risk 

The basic and most successful casualty insurance paradigm is
the accumulated claim, or aggregate claim, or risk model, in which
two random variables of interest are explicitly recognized: the
random number of claims, $\tilde{k} = \tilde{k}(t_1, t_2)$, in some exposure interval
$(t_1, t_2)$, and the random value of each claim, $[\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_{\tilde{k}}]$ [B37,
S6]. The economic function of interest is the random total of
accumulated claims, $\tilde{s} = \tilde{s}(t_1, t_2)$, which is a random sum of random
variables:

$$\tilde{s} = \tilde{x}_1 + \tilde{x}_2 + \cdots + \tilde{x}_{\tilde{k}} .$$

The stochastic curve generated by $\tilde{s}$ as $t_2$ increases, with $t_1$
fixed, is sometimes called the accumulated claim process, to
distinguish it from the underlying point process of the
claim epochs, which generate the monotone claim counting process,
$\tilde{k}$, as $t_2$ increases (Figure 2). Usually two assumptions are made:

(1) Given $\tilde{k} = k$, the random variables $[\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_k]$ are
mutually independent and identically distributed;

(2) Further, their common distribution does not depend upon $k$.

The  first  assumption  is  apparently  satisfied  or  is  a  satisfactory 
approximation in practice, for I can find no discussion or verification
of this in the literature. There are occasionally examples
where  (2)  is  not  verified,  as  when  someone  who  has  a  large  number 
of  claims  turns  out  to  have  smaller  (or  larger)  claims,  on  the 
average,  than  someone  with  a  smaller  number  of  claims;  but  there 
is  another  explanation  for  this  observation,  given  below  in  the 
Section  on  Credibility  Theory. 

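Assuming both (1) and (2), the first two moments are the
important and familiar relations:

$$\mathcal{E}\{\tilde{s}\} = \mathcal{E}\{\tilde{k}\} \cdot \mathcal{E}\{\tilde{x}\} ; \qquad \mathcal{V}\{\tilde{s}\} = \mathcal{E}\{\tilde{k}\} \cdot \mathcal{V}\{\tilde{x}\} + \mathcal{V}\{\tilde{k}\} \, [\mathcal{E}\{\tilde{x}\}]^2 ,$$

where $\tilde{x}$ stands for a generic claim amount. Notice that, although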
these  assumptions  require  the  claim  sizes  to  be  stationary  in  time, 
the  claim  number  process  may  still  have  a  time-varying  law. 
Different  formulations,  such  as  underlying  birth-and-death, 
renewal,  or  time-varying  Poisson,  have  been  attempted,  but  as  a 
practical  matter,  it  is  difficult  to  work  with  models  in  which 
the  numbers  of  claims  in  adjacent  intervals  are  dependent.  So, 
usually,  a  stationary,  or  simple  time-varying  Poisson  assumption 
is  made.  The  resulting  distribution  of  s  ,  after  a  possible 
operational  time  transformation,  is  the  familiar  mixture  of 
convolutions  of  the  distribution  of  x  . 
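
The two moment relations can be checked by direct simulation. The
sketch below is in Python and rests on assumptions chosen only for
illustration: a stationary Poisson claim number and exponentially
distributed claim sizes. The function names are hypothetical, and the
Poisson sampler is the elementary multiplication method, adequate only
for small means.

```python
import math
import random


def compound_moments(mean_k, var_k, mean_x, var_x):
    """Mean and variance of a random sum under assumptions (1) and (2)."""
    mean_s = mean_k * mean_x
    var_s = mean_k * var_x + var_k * mean_x ** 2
    return mean_s, var_s


def poisson_sample(rng, lam):
    """Draw one Poisson variate by multiplying uniforms (small lam only)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_total_claims(lam, mean_claim, n_periods, seed=0):
    """Simulate the accumulated claim total for n_periods exposure intervals."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_periods):
        k = poisson_sample(rng, lam)
        # expovariate takes the rate, i.e. the reciprocal of the mean claim.
        totals.append(sum(rng.expovariate(1.0 / mean_claim) for _ in range(k)))
    return totals
```

For a Poisson claim number the mean and variance of k coincide, so
`compound_moments(lam, lam, mean_x, var_x)` reduces to the familiar
compound Poisson variance, lam times the second moment of a claim.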

Surprisingly,  in  spite  of  the  Central  Limit  Theorem  and  the 
Poisson  assumption,  it  turns  out  to  be  difficult  to  get  exact 
results  or  good  approximations  to  the  total  claim  distribution  in 
the  case  of  general  individual  claim  distributions  [B7,  S6]  . 

As  the  controversy  over  which  is  the  best  approximation  method  is 
still  going  on,  I  will  close  simply  with  the  remark  that  all 
paradigms,  including  approximations,  should  be  judged  in  terms  of 
their  usefulness  for  making  decisions,  or  in  communicating  with 
other  scientists,  but,  as  far  as  I  can  tell,  this  has  not  yet 
happened. 

Collective  Variability 

The  next  model  was  developed  primarily  to  explicate  certain 
non-obvious  variations  in  observed  outcomes  from  a  portfolio ,  or 
collective  of  risks;  however,  it  has,  as  we  shall  see,  profound 
implications  for  model-building  and  for  estimation. 

Consider that, for a single risk contract, we are measuring
some outcome random variable, $\tilde{x}$, and that the distribution density
of $\tilde{x}$, $p(x|\theta)$, depends upon one or more parameters which we here
symbolize by $\theta$. Now, if $\theta$ were an observable physical quantity,
then we can imagine that it would be easy to assemble a cohort of
M individual risks, i = 1, 2, ..., M, each of which had the same
parameter values, $\theta_1 = \theta_2 = \cdots = \theta_M$, from each of which we
could draw one sample outcome, $x_1, x_2, \ldots, x_M$, as M independent,
identically distributed, random samples from the same "urn",
$p(x|\theta)$. But many of the interesting parameters in casualty
insurance, such as "accident-proneness", are difficult to measure,
and one can easily imagine that, no matter how hard one would try
to assemble a homogeneous portfolio, there would be still factors
which could not even be described, and which would lead to unexplainable
residual variabilities [A3].

The intuitive leap which makes this a most useful paradigm
is to imagine that, in fact, the abstract parameters $\theta_i$ for each
risk i = 1, 2, ..., M are not the same, but are given by selecting
the risk parameters from some structure function density, $u(\theta)$,
which describes the variation of the $\theta_i$ as if they were independent
samples of some random quantity $\tilde{\theta}$ [B37]. This leads to an
"urn of urns" interpretation, shown in Figure 3, in which the
drawing of a particular random outcome, $\tilde{x}_i = x_i$, is a two-stage
process, in which a risk parameter $\tilde{\theta} = \theta_i$ is selected using the
structure density, $u(\theta)$, and then $x_i$ is an independent sample
from the conditional density $p(x|\theta)$. This means that each $\tilde{x}_i$
has the same marginal mixed density, $p(x) = \int p(x|\theta) \, u(\theta) \, d\theta$,
but more importantly, means that the joint density of several
samples from the same risk i, $[\tilde{x}_{i1}, \tilde{x}_{i2}, \ldots, \tilde{x}_{in}]$, appears, a priori
(before knowing $\theta_i$, and averaging over the collective), to be
dependent,

$$p(x_{i1}, x_{i2}, \ldots, x_{in}) = \int p(x_{i1}|\theta) \, p(x_{i2}|\theta) \cdots p(x_{in}|\theta) \, u(\theta) \, d\theta ,$$

even though, when the particular urn value, $\theta_i$, is given, the
successive samples will be independent. This leads to a dependency
which makes it possible to successively "narrow in" on the correct
value of $\theta_i$, even though it is unobservable, by taking a large
enough number of samples from urn i (see the Section on Bayesian
statistics).

The practical importance of the mixing model is that it
introduces a new source of variability due to the variation of $\theta$
in the collective. If we let

$$m(\theta) = \mathcal{E}\{\tilde{x}|\theta\} ; \qquad v(\theta) = \mathcal{V}\{\tilde{x}|\theta\} ;$$

be the first two moments of the observable value $\tilde{x}$, drawn using
$p(x|\theta)$ from an urn with a given $\theta$, then it can easily be shown
that the result of making a draw from an arbitrary urn in the collective
(i.e., from $p(x)$) must have first two moments:

$$m = \mathcal{E}\{\tilde{x}\} = \mathcal{E}\{m(\tilde{\theta})\} ;$$

$$v = \mathcal{V}\{\tilde{x}\} = E + D ; \qquad E = \mathcal{E}\{v(\tilde{\theta})\} ; \qquad D = \mathcal{V}\{m(\tilde{\theta})\} .$$

The first result is expected, since the overall mean is simply the
average-over-the-collective mean. But the total variance has two
components: the first, E, being the average-over-the-collective
of the individual urn variances, and the second, D, representing
the variability-over-the-collective of the mean result from each
risk. If the portfolio were homogeneous, this unexpected term
D would vanish.

Practically, the mixing of distributions leads to new forms
which are often very useful, and match empirical data well. For
instance, if $p(x|\theta)$ is Poisson with parameter $\theta$, and the
structure density is Gamma distributed, then the mixed density
is the useful Negative Binomial.
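
The Poisson-Gamma case makes the two variance components concrete,
since for a Poisson urn both the conditional mean and the conditional
variance equal the risk parameter. The Python sketch below simulates
the two-stage "urn of urns" draw; the Gamma shape and scale values, the
function names, and the elementary Poisson sampler are assumptions made
only for this illustration.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson variate by multiplying uniforms (small lam only)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def variance_components(shape, scale):
    """Return (E, D) for a Poisson urn with a Gamma structure density.

    E is the mean of the urn variances, D the variance of the urn means.
    """
    e_part = shape * scale       # mean of the Gamma risk parameter
    d_part = shape * scale ** 2  # variance of the Gamma risk parameter
    return e_part, d_part


def simulate_collective(shape, scale, n_risks, seed=0):
    """One outcome per risk: pick the risk parameter, then draw from its urn."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_risks):
        theta = rng.gammavariate(shape, scale)       # stage 1: select the urn
        outcomes.append(poisson_sample(rng, theta))  # stage 2: draw from it
    return outcomes
```

The sample variance of the simulated outcomes should be close to E + D,
visibly larger than the sample mean, which is the overdispersion that
distinguishes the Negative Binomial from a single Poisson urn.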

The  only  serious  challenge  to  this  paradigm  seems  to  be  the 
philosophical  point  that  we  perhaps  have  no  business  modelling 
with  a  quantity  which  is  unobservable,  and  may  not  follow  the 
usual  laws  governing  a  random  variable.  This  is  a  fine  point 
which depends upon what one means by "random". But if we believe
that the model holds, there are ways to infer the law of $\tilde{\theta}$, if
the  data  set  is  large  enough,  which  we  shall  describe  in  a  later 
Section. 

Extreme  and  Dangerous  Values 

There is another concept which is part of the model-building
tradition in non-life insurance, but as yet is only partly
formed: the idea that, no matter what model we use, actual data
will always contain some dangerous "surprises" that can lead to
business instability and ruin.

Leaving  aside  personal  views  that  "Nature  is  always  perverse", 
we  might  first  of  all  attempt  to  describe  this  behavior  by  using 
extreme  value  theory  to  estimate  the  largest  observed  claims  and 
to set reinsurance treaties [76.9, B19, T15]. See also [80.30,
76.15, R1, R2] for applications where extreme values are part of
the risk process.

Another  approach  is  to  develop  justifications  and  estimation 
methods  for  long-tail  "dangerous"  distributions,  such  as  the 
Pareto  [80.44,  B14] ;  these  distributions  are  not  often  used  for 
regular  modelling  because  higher  moments  may  not  exist. 

Yet  another  approach  [76.17]  is  to  consider  that  what  is 
really  happening  is  a  mixture  of  two  models,  in  which  the  "regular" 
values  are  being  contaminated  by  the  "abnormal"  ones.  This,  of 
course,  is  related  to  the  collective  model  just  analyzed,  and 
to  the  Bayesian  problem  of  model  mixtures  (see  later) . 

There  is  also  a  well-developed  theory  of  outliers  in  the 
statistical literature, but I am not certain if the ideas represented
above are simply expressions of this phenomenon, or
whether  the  authors  mean  that  a  more  structured,  as  yet  imprecise, 
kind  of  dangerous  risk  mechanism  is  at  work.  It  will  be  interesting 
to  see  if  this  paradigm  develops  further. 

THE DYNAMIC RISK PROCESS PARADIGM

Reserves versus Stability

The first model which analyzed the effect of random fluctuations
upon the risk business of the insurance company as a whole
is the venerable collective risk theory, which to avoid confusion
with a previous usage, will be referred to as the dynamic risk
process paradigm. If we imagine that a company has developed a
fixed portfolio of individual risk contracts in a given business
line, then its accumulated net premium income in interval (0,t)
can be approximated by the straight line $\Pi t$, where $\Pi$ is the
total premium rate. Against this pure premium income must be
paid out the stochastic accumulated claim process, $\tilde{s}(t) = \tilde{s}(0,t)$,
described in the last section, assumed now to be aggregated over
all the contracts in the homogeneous portfolio, with the distribution
of claim amount, $\tilde{x}$, the same for all risks, and the
claim number process $\tilde{k}(t) = \tilde{k}(0,t)$ referring to the counting of
all claims from the portfolio. This cohort is said to be "in
balance" if $\Pi = \mathcal{E}\{\tilde{x}\} \cdot \mathcal{E}\{\tilde{k}(t)/t\}$, either for the interval (0,t)
under consideration, or for t "large enough", depending upon
the underlying claim number process.

The difference between the two accumulating processes, the
underwriting gain for the portfolio, will, on the average, be zero,
but of course may vary widely into negative and positive values.
To provide an element of stability, management furnishes an initial
amount of risk (fluctuation, free, technical) reserve, $R_0$, and
then describes its underwriting results in terms of the risk
reserve process, $\tilde{R}(t) = R_0 + \Pi t - \tilde{s}(t)$, shown in Figure 4. The
steadily increasing value due to the net premium income is
interrupted at random epochs by random-sized drops due to losses
(benefits) paid out to the policyholders. Although this process
now fluctuates around $R_0$, if balanced, it clearly is possible
that, at some random point $\tilde{t}^*$, a large loss when the portfolio
is in an already precarious position will cause $\tilde{R}(\tilde{t}^*) < 0$, a
condition known as technical insolvency, or to use the more
graphic term, ruin, in the portfolio, or in the company. In its
classical form, this model does not examine the cost to the
company of providing $R_0$, nor the economic consequences of ruin,
but concentrates simply on the trade-off between stability and
ruin, that is, on the effect of $R_0$ on the probability of ruin,
$P^*(t) = \Pr\{\tilde{R}(t^*) < 0 \text{ for some } 0 < t^* < t\}$, or on the mean time to
ruin, $\mathcal{E}\{\tilde{t}^*\}$, as $t \to \infty$, given that ruin occurs.

[FIGURE 5: Virtual Waiting Time Process in a Single-Server Queue]
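
The trade-off between the initial reserve and the probability of ruin
can be explored by elementary simulation. The Python sketch below is
one possible setup, not the classical analytic method: it assumes
stationary Poisson claim epochs and exponential claim amounts, and the
function and parameter names are hypothetical. Since the reserve rises
between claims, ruin need only be tested at the claim epochs.

```python
import random


def simulate_ruin_probability(initial_reserve, premium_rate, claim_rate,
                              mean_claim, horizon, n_paths, seed=0):
    """Estimate the probability of ruin before `horizon` by simulation."""
    rng = random.Random(seed)
    ruined = 0
    for _ in range(n_paths):
        t = 0.0
        total_claims = 0.0
        while True:
            # Next claim epoch of the Poisson process (rate = claim_rate).
            t += rng.expovariate(claim_rate)
            if t > horizon:
                break
            # Exponential claim amount; expovariate takes the reciprocal mean.
            total_claims += rng.expovariate(1.0 / mean_claim)
            # Risk reserve just after the claim: initial + premiums - claims.
            if initial_reserve + premium_rate * t - total_claims < 0.0:
                ruined += 1
                break
    return ruined / n_paths
```

A call such as `simulate_ruin_probability(5.0, 1.2, 1.0, 1.0, 50.0, 4000)`
uses a premium rate 20% above the balanced rate; repeating it for
several initial reserves traces out the stability-ruin trade-off.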

This important model was pioneered in the early part of this
century by Filip Lundberg, and then developed by Cramér, Segerdahl,
Esscher, Ammeter, Philipson, Sparre Andersen, and many other
famous names in a principally all-Scandinavian school of actuaries
and academics. It is impossible to provide an overview of this
voluminous field in this short space, but there are numerous
surveys starting with Cramér's in 1955 [C5], and the papers which
appeared in the two symposia which bear Lundberg's name [F6, F7],
not to mention actuarial texts [B7, B37, S6], and monographs
[B8, S10].

An examination of the literature will reveal that serious
study of this paradigm continues unabated. For example, even
though the basic theory and methods needed to compute $P^*(t)$ and
other measures were known very early, analytical computation is
difficult enough, and has so many intriguing variations and insights,
that it has continued to attract actuarial researchers. See, among
others, [80.31, 76.12, 76.18, D14, F7, T2, T8]. [B23, S9] mention
elementary simulations, and there are no doubt many investigations
which have preferred to follow this route for specific empirical
assumptions.

Stochastic  Evolution 

But  the  dynamic  risk  process  paradigm  has  been  of  value  to 
scientists  for  more  than  just  the  analysis  of  performance  measures; 
this  extraordinary  activity  has  also  given  the  basic  theory  and 
applications  of  stochastic  processes  an  important  impetus.  Indeed, 
it would not be an exaggeration to say that many of today's
standard topics in stochastic processes (the models of random
walks, stationary point and increment processes, renewal processes,
Poisson and birth-and-death processes, diffusion processes,
martingales, etc.) would be much less fully developed if Scandinavian
actuaries had not been concerned with stability and ruin.
For some recent stochastic model generalizations, see [80.8, 80.22,
B8, B9, B10, B11, B12, D12, D15, T14, V2].

I  see  this  especially  in  my  own  specialty  of  operations  research, 
where  this  influence  has  been  felt  in  a  most  circuitous  way.  We 
know,  for  example  that  the  early  work  on  telephone  traffic  problems 
by  A.  Erlang  was  related  to  and  influenced  by  the  early  work  on 
risk theory, and that this then led to the earliest developments
of mass service systems, what we now call queuing theory. For
example, Figure 5 shows the virtual waiting process, w(t), in a
single-server  queue,  defined  as  that  time  which  a  hypothetical 


customer,  just  joining  the  queue  at  time  t  ,  would  have  to  wait 
until  he  reached  the  front  of  the  queue,  and  began  service.  As 
time  progresses,  the  current  customer  in  service  is  completing 
his  service  at  a  uniform  rate  of  unity,  so  the  residual  waiting 
time  of  the  virtual  customer  is  also  decreasing;  but  then,  a 
new  (real)  customer  joins  the  queue  at  some  random  instant,  and 
adds  his  random  service  time  to  the  waiting  time  of  the  (bumped) 
virtual  customer.  Clearly,  this  stochastic  process  is  just  the 
mirror  image  of  the  risk  reserve  process,  and  often  has  similar 
underlying  assumptions. 
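
The mechanics of the virtual waiting time are simple enough to state
in code. The Python sketch below follows the verbal description: the
backlog runs down at unit rate, never below zero, and jumps up by the
service time at each arrival. The function name and the small example
are illustrative assumptions.

```python
def virtual_waiting_time(arrival_epochs, service_times, t):
    """Virtual waiting time w(t) in a single-server queue, empty at time 0.

    arrival_epochs: increasing arrival times of the real customers.
    service_times:  service requirement brought by each customer.
    The value returned includes any customer arriving exactly at t.
    """
    backlog = 0.0
    last = 0.0
    for epoch, service in zip(arrival_epochs, service_times):
        if epoch > t:
            break
        # Work is served at unit rate since the last arrival, never below zero,
        # and the new customer then adds his service time to the backlog.
        backlog = max(0.0, backlog - (epoch - last)) + service
        last = epoch
    return max(0.0, backlog - (t - last))
```

Turned upside down, the same path looks like a risk reserve: the
downward drift plays the part of premium income and the upward jumps
play the part of claims, except that the queue has a barrier at zero
where the reserve process has ruin.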

In  fact,  when  I  explain  the  risk  reserve  process  to  my 
students  in  operations  research,  one  of  the  first  things  they 
usually  say  is  that,  since  there  have  been  so  many  powerful 
advances  in  queuing  theory  in  the  last  20  years  (many  of  which 
they  have  studied) ,  obviously  they  can  be  of  tremendous 
assistance to actuaries faced with a reserve-setting problem!
Which only indicates the universality of a good scientific paradigm,
and the importance of good communications. I have looked in vain
for a comprehensive survey of this insurance-telephone traffic-queuing
theory development; perhaps some of you may know of
references in addition to [H19, S7, T2].

Dividends  versus  Ruin  and  Other  Generalizations 

Returning  to  insurance  applications,  there  have  been  many 
generalizations  of  the  model  to  make  it  more  realistic.  For 
example,  it  was  quite  early  realized  that  if  the  risk  reserves 
get too high, the company should declare dividends to its stockholders,
or if ruin is imminent, it should take some corrective
action, such as obtaining new financing; this then leads to a
random walk between two barriers, and various "band strategy"
decision problems. See, e.g. [80.31, 76.8, B37, G4, P3].

Other  typical  generalizations  include  a  discounted  ruin 
probability  [A4],  compounding  assets  [E5] ,  the  use  of  credibility 
theory  [D21] ,  and  behavior  under  inflationary  conditions  [T9]. 

There  is  also  an  interesting  body  of  literature  which  takes  the 
most-used  approximation  forms  from  the  analytic  theory,  and  uses 
these  forms  to  suggest  general  rules  for  reserve  estimation  and 
regulation  [A5,  A6 ,  B25,  D14,  W5] . 

In reviewing the many contributions to stochastic risk
processes, I have been struck by how closely its development
parallels the normal science concept proposed by Kuhn. In my
opinion, the theory has now far outstripped the actual applications
in insurance and elsewhere, and is proceeding as an active area of
theoretical activity merely because of the attractive difficulty
and beauty of the model; in short, it has become a puzzle-solving
activity. In fact, the basic model has apparently not been calibrated
with real-life experiences; to quote John Wooddy [L2]:


"...I  should  like  to  see  more  cases  where  a  problem  is 
seeking  a  theory  and  fewer  where  a  theory  is  seeking  a 
problem.  I  had  an  interesting  conversation  at  the  recent 
ASTIN Colloquium with Harald Bohman, who has written extensively
on ruin theory. He asked me what the causes of
actual  insurance  company  insolvencies  have  been.  In  the 
United  States  in  the  past  ten  to  fifteen  years  we  have  had 
a  significant  number  of  failures  of  both  life  and  non-life 
companies.  We  thus  have  a  large  amount  of  data  for  studies 
of,  say,  causes  of  insurance  company  insolvency,  or  stages 
along  the  road  to  failure.  Here  is  a  prime  subject  for 
actuarial  research  whose  results  would  be  of  intense 
interest  to  many  people." 


I do not intend to be critical of the pioneers in this field;
we have already seen the singular importance of their work to
actuarial science and to other scientific communities. Furthermore,
it is an intrinsically interesting paradigm that will
continue to attract attention and generate communications.

"Few  people  who  are  not  actually  practitioners  of  a  mature 
science  realize  how  much  mop-up  work  of  this  sort  a  paradigm 
leaves  to  be  done  or  how  quite  fascinating  such  work  can 
prove  in  the  execution"  [K7,  p.24]. 

I  believe  this  does  mean,  however,  that  for  further  practical 
advances  in  the  management  of  the  insurance  enterprise,  we  shall 
have  to  look  into  other  models  and  methods. 




INSURANCE  OPERATIONS  PARADIGMS 

Premium  Setting 

We  turn  now  to  the  various  functional  areas  of  the  insurance 
company  where  the  skills  of  the  actuary  and  his  basic  models  are 
employed. First and foremost is in underwriting, in setting premiums.

Assuming  an  adequate  data  base,  then  the  first  step  is  to 
calculate  the  fair  (net,  pure)  premium  which  is  just  the  expected 
value  of  the  loss  under  the  contract  exposure,  using  one  of  the 
models  previously  discussed.  But  then,  especially  in  casualty 
applications  where  the  risk  is  immediate  and  the  variance  is 
high,  comes  the  idea  of  adding  a  security  (fluctuation)  loading 
to the fair premium, which was originally motivated by the
concepts of dynamic risk reserves and protection against instability
just discussed. Three simple ideas come immediately to mind:

(1)  One  can  simply  gross  up  the  fair  premium; 

(2) If the risk distribution is approximately normal, then
we can protect against exceedance of a given loss with
a certain probability by adding a risk loading proportional
to the standard deviation of the random outcome;

(3) If we require that the premium be additive, in the
sense that the loaded premium for the coverage of two
independent risks shall be the same as the sum of the
individually loaded premiums, then (1) is still applicable,
but (2) is not; however, we may use a loading
proportional to the variance of the risk.

This means that the fluctuation-loaded premium $\Pi$ for a risk $\tilde{x}$
might have the form

$$\Pi = (1 + a) \, \mathcal{E}\{\tilde{x}\} + b \sqrt{\mathcal{V}\{\tilde{x}\}} + c \, \mathcal{V}\{\tilde{x}\}$$


with a, b, c constants chosen from stability considerations
[76.6, B17, B20, B37, S6]. The question of desirable premium
calculation principles is still under active discussion, and a
variety of other proposals have been made, such as the use of
the semi-variance [B21] and concepts from capital market theory
[K1]. We shall examine the application of utility theory ideas
in a later Section.
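
The additivity argument behind idea (3) can be verified with a few
lines of arithmetic. The Python sketch below evaluates the loaded
premium form with constants a, b, c for two independent risks and for
their sum; the function name and the numerical means and variances are
hypothetical, chosen only to make the comparison visible.

```python
import math


def fluctuation_loaded_premium(mean, variance, a=0.0, b=0.0, c=0.0):
    """Loaded premium: (1 + a) * mean + b * standard deviation + c * variance."""
    return (1.0 + a) * mean + b * math.sqrt(variance) + c * variance


def combine_independent(risk_1, risk_2):
    """Mean and variance of the sum of two independent risks."""
    (mean_1, var_1), (mean_2, var_2) = risk_1, risk_2
    return mean_1 + mean_2, var_1 + var_2
```

For independent risks the means and the variances add, so the
proportional and variance loadings are additive, while the standard
deviation of the sum is smaller than the sum of the standard
deviations, which is why the loading in (2) fails the additivity test.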

The next step in premium setting is to determine the additional
50-200% increase which determines the commercial premium by adding
expense and profit loadings. Except in life insurance, where
there are specific cost models for sales commissions (in many
cases of regulated form), there seems to be no further modelling
principle used, except the adjustment of the factor a above, or
grossing-up of $\Pi$ itself. This lacuna in the literature is all
the more surprising, as it is in sharp contrast to the fields of
engineering and business management, where extensive and sophisticated
cost allocation and modelling are the order of the day.
Are these activities outside of the realm of the actuary, or are
they considered extremely Company Confidential? Perhaps someone
can enlighten me.

Tariff  Construction 

I  am  using  tariff  construction  to  mean  that  underwriting 
activity  in  which  the  entire  structure  of  rates  between  related 
but  not  identical  risk  contracts  is  classified,  compared,  and 
rationalized  by  further  adjustments  —  for  example,  in  setting 
the  physical  damage  portion  of  automobile  insurance  by  comparison 
across  the  class  of  all  different  vehicles  covered,  including 
their special characteristics such as cost, cost of repair, size,
horsepower, etc., plus characteristics of the driver(s) and the
use to which the vehicle is put, where it is garaged, and so on.

It  seems  to  me  that  this  structuring  away  from  purely  individual 
risk  setting  is  done  for  two  reasons: 

(1) To provide a smoothness or functional form of premium
variation across the variation of the risk characteristics
which approximates some desirable statistical
or physical law, giving the structure a robust, rational,
and defensible form; or

(2)  For  competitive  reasons  in  comparison  against  another 
company's  or  the  industry-average  tariff  structure. 

It should also be admitted that for many lines of insurance, there
is an extremely large number of different rates to be determined
by many small companies who cannot afford the continuing service
of an actuary, and that the underlying tariff structure is essentially
furnished by an industry-wide rating bureau.

The  first  problem  in  tariff  construction  is  to  select  the 
risk  factors  on  which  the  structure  will  be  based.  Where  these 
variables  are  not  given  by  law,  tradition,  or  public  policy,  this 
search  can  be  rather  complex  and  "artistic",  limited  only  by  the 
imagination  of  the  actuary  (and  the  company's  sales  force). 

Usually, various proposals are made, and then examined by statistical
regression or cluster analysis methods within the preferred
model structure, which is usually an additive or multiplicative
model with a certain number of free parameters. The statistical
parameter estimation method chosen provides the mechanism for
ranking the influence of the various factors and for suggesting
which ones should be dropped, although there are usually various
technical problems relating to the independence of the factors
and the validity of the model form chosen [80.43, 80.46, 76.2,
76.3, A2, H6, K2, L8, P10]. This structure is then presented to
management, the regulating agencies, and the general public, and
further iterations are made as necessary to develop an acceptable
risk classification and tariff structure.

Speaking as an engineer-physicist, I must say that I am not
very satisfied with these blind, statistically-based,
"try-it-and-see" procedures. For one thing, there are few general
principles involved [R3, S4] to help guide this search, and to give
insight into the business management trade-offs between factors
included, and factors dropped. For another, there is usually very
little physical or economic motivation behind the choice of an
additive or multiplicative model, except possibly the reduction of
costs through administrative simplicity, or for ease in statistical
estimation. And, there is always the problem of risk factors
that are overlooked [A3].

In a few lines of casualty insurance, there are basic physical
laws which seem to be guiding tariff construction. In fire
insurance, there is not only experimental knowledge about the
inflammability of the structure and its contents, there are also
certain physical dimensions and relationships of volumes, together
with experience in using protective devices and sprinkler systems
and information about fire-department response times, which enable
decomposition of the problem into the probability of ignition, the
rate of spreading and "contagion", the ultimate damage potential,
and the observed degree of damage [B13, B15, J20, J21, R2, S15, S16].
There is also an obvious motivation for using the theory of extreme
values in examining fire losses, and other catastrophic situations,
such as earthquakes and floods [80.30, 76.15]. In my opinion,
actuaries could benefit from working more with engineers and
physical scientists who have a lot to say about the physical laws
and risk factor relationships which hold for fire, collisions,
explosions, earthslides and earthquakes, floods, windstorms and
hail, contamination and pollution, radioactivity effects and
dispersal, occupational health and safety, rehabilitation
management, and so forth. Every large insurance company has specialists
in  these  areas,  of  course,  but  they  are  busy  with  contract  and 
claims  administration  matters,  rather  than  influencing  tariff 
construction.  This  is  rather  like  driving  an  automobile  by 
looking  out  the  rear-view  window  only. 

Also, none of these approaches provides the economic and
competitive rationale for moving income and profitability between
different risk classifications for the sake of structural
consistency. In [J7], I made a modest proposal that perhaps the
underwriting specialists could express their preferences through
simple inequality relationships between the classes (e.g., that the
premium should be a non-decreasing function of horsepower); however,
this model has sunk without a trace.

In the next-to-last Section, I shall examine the conflict
between previously acceptable risk classification procedures
and recent shifts in public opinion and societal objectives.

Underwriting  Exposure  and  Risk  Selection 


There appear to be few general models which can be used to
analyze underwriting exposure directly; yet, redesign of risk
contracts is badly needed, for example, in product liability [O1].
A few recent works have appeared on the decision problem of offering
or withdrawing a risk contract [80.29, 76.11, G9].

Bonus/Malus and Consumer Behavior


Most  insurance  underwriting  recognizes  the  possibility  of 
rate  revision  for  the  individual  risk,  based  upon  retrospective 
experience  rating.  In  a  later  Section,  we  will  consider  the 
Bayesian  approach  to  this  problem  through  Credibility  Theory. 

The focus of this Section will be on Bonus/Malus systems, where
a policyholder is moved from one premium-rating class to another,
based upon his immediate past experience; for example, in
automobile insurance, it is usually only the number of recent claims
which affects the policyholder's transition from one class to
another. These systems are usually posed empirically, and then
analyzed through Markov chain techniques [S6].
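
To make the Markov chain analysis concrete, here is a minimal sketch
in Python. The three-class system, its transition rule, and the
Poisson claim frequencies are hypothetical choices made only for
illustration; they are not taken from any of the systems cited.

```python
import math


def transition_matrix(claim_rate, n_classes=3):
    """Transition matrix of a hypothetical bonus/malus system.

    Class 0 carries the lowest premium. A claim-free year moves the
    policyholder one class down (toward 0); one or more claims send
    him to the highest class. Claim counts are assumed to be Poisson.
    """
    p0 = math.exp(-claim_rate)          # probability of a claim-free year
    top = n_classes - 1
    matrix = [[0.0] * n_classes for _ in range(n_classes)]
    for k in range(n_classes):
        matrix[k][max(k - 1, 0)] += p0  # bonus step
        matrix[k][top] += 1.0 - p0      # malus step
    return matrix


def stationary_distribution(matrix, n_iter=1000):
    """Long-run class occupancy, found by iterating the chain."""
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(n_iter):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n))
                for j in range(n)]
    return dist


# Long-run class occupancy for a low-risk and a high-risk driver.
good_driver = stationary_distribution(transition_matrix(0.05))
bad_driver = stationary_distribution(transition_matrix(0.20))
```

Comparing the two stationary distributions gives a rough idea of how
well such a system separates the two risks in the long run.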

It turns out that most automobile bonus/malus systems currently
in use do not seem to be very good risk discriminators, and various
methods have been proposed to analyze and improve their efficiency
[D5, L16, N3]. Interestingly, recent communications have explicitly
recognized a well-known aspect of consumer behavior, namely,
that if not reporting or under-reporting a claim will affect future
premiums in a way known to the policyholder, then his "hunger
(thirst) for bonus" will cause him to choose that strategy which
is, in total, least costly. Assuming that the rational consumer
will behave this way then leads to a new series of actuarial
models in which this effect is taken into account [D6, H2, H3, L5,
L7, N2]. This bonus hunger is also recognized in other insurance
settings where there is a consumer choice of deductibles and the
possibility of "anti-selection" exists [A7, S21].


Claim  Reporting  Delays 

An interesting new model has recently appeared, beginning
essentially with a Boleslaw Monic Fund Competition in 1971 [N1],
to help with a long-recognized problem with certain long-duration
claims, namely, the delays in reporting the numbers or total
amounts of the losses: the so-called IBNR (Incurred But Not
Reported) problem [80.17, B6, D16, K6, S19, T3, T6].

The  situation  is  depicted  in  the  "Run-Off  Triangle"  in 
Figure  6.  If  we  imagine  that  all  reporting  is  quantized  according 
to some common accounting period, say, the calendar year, then
in  a  given  Observation  Year  the  vertical  axis  represents  the 
interval  in  which  the  claim  event  was  incurred  —  the  Accident 
Year  —  and  the  horizontal  axis  represents  the  number  of  periods 
intervening  —  the  Development  Years  —  since  the  claim.  Obviously 
earlier  events  have  had  more  chance  to  "develop"  than  later  ones; 
if  we  place  in  the  corresponding  cells  the  cumulative  claim  costs 
from  a  given  Accident  Year,  then  we  can  empirically  observe  the 
"run-off",  as  the  figures  mount  toward  their  asymptotic  totals 
with  each  Development  Year.  Note  that  each  additional  Observation 
Year  is  represented  by  a  diagonal  line  in  this  triangle;  the 
literature  contains  many  of  the  23  other  ways  of  representing  the 
triangle. 
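
The bookkeeping behind the Run-Off Triangle can be sketched in a few
lines of Python. The claim figures below are invented for
illustration, and the function names are not from any cited work;
the point is only that cumulative costs run along each Accident Year
row, while a given Observation Year picks out one diagonal.

```python
def cumulative_triangle(incremental):
    """Cumulate incremental claim costs along each Accident Year row."""
    triangle = []
    for row in incremental:
        total, cum_row = 0.0, []
        for amount in row:
            total += amount
            cum_row.append(total)
        triangle.append(cum_row)
    return triangle


def observation_year_diagonal(triangle, obs_index):
    """Cells that become known in observation year `obs_index`.

    Cell (i, j) holds Accident Year i at Development Year j and is
    first observed in observation year i + j, so each Observation
    Year corresponds to one diagonal of the triangle.
    """
    return [row[obs_index - i] for i, row in enumerate(triangle)
            if 0 <= obs_index - i < len(row)]


# Hypothetical incremental claim costs: row i = Accident Year i,
# column j = Development Year j. Later years have fewer known cells.
incremental = [
    [100.0, 50.0, 25.0, 10.0],
    [110.0, 60.0, 30.0],
    [120.0, 65.0],
    [130.0],
]
triangle = cumulative_triangle(incremental)
latest = observation_year_diagonal(triangle, 3)
```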

As with all useful paradigms, this model was quickly generalized
and extended to include many features of interest. The basic
modelling problem is to describe how each year's loss components
are generated, compared with previous experience for the same
Accident Year (e.g., row-wise run-off), and what effects, such as
inflation, may affect losses reported in the same Observation Year
(e.g., diagonal coupling effects). In fact, as Bühlmann, Schnieper,
and Straub [B45] point out, there are two separate delay
effects: the Incurred But Not Yet Reported (IBNYR) effect, due
to initial delays in filing claims with the reporting system [F2, K4],
and the Incurred But Not Fully Reported (IBNFR) effect, due to the
manner in which claim costs are generated by the loss (as in
workers' compensation and health insurance), but also due to the
delays in the claim processing and the reporting system.

Retention  and  Reinsurance 

If  a  given  portfolio  of  risks  has  too  much  variability  relative 
to  the  reserves  of  a  company,  then  that  company  will  normally  enter 
into a risk-sharing agreement with another carrier, or obtain
reinsurance; we will examine the optimal forms of such contracts
under  utility  theory  assumptions  in  a  later  Section. 

But relatively early models (see, e.g., [B7, S6]) revealed that,
under reasonable assumptions, a company should prefer to "lay off"
the tail x > M of its risk x, and retain only the reduced risk
y = min(x, M); that is, it should obtain Stop-Loss reinsurance.
Other authors have studied other objectives and forms of retention
(see, e.g., [B16, B18, B19, L4, S20, W1]).

Determining the properties of y as a function of M when
x is a compound risk sum s turns out to be one of those
interesting puzzles which continues to occupy actuarial attention
[80.15, B42, C4, G1, G10, G14, H9, T4, V1]. Recent papers have
considered the connection with utility theory (see later) [80.12,
80.23, G14].
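
As a rough numerical illustration of the retained risk
y = min(x, M), the Python sketch below simulates a compound risk sum
and estimates the mean and variance of y for several retention
limits M. The Poisson claim count, the exponential claim sizes, and
all parameter values are assumptions made for the example only.

```python
import random


def compound_poisson_sample(rng, claim_rate, mean_claim):
    """One draw of a compound Poisson sum with exponential claim sizes."""
    total = 0.0
    t = rng.expovariate(claim_rate)     # arrival time of the first claim
    while t < 1.0:                      # claims arriving within one period
        total += rng.expovariate(1.0 / mean_claim)
        t += rng.expovariate(claim_rate)
    return total


def retained_moments(samples, retention):
    """Sample mean and variance of the retained risk min(x, M)."""
    retained = [min(x, retention) for x in samples]
    mean = sum(retained) / len(retained)
    var = sum((y - mean) ** 2 for y in retained) / (len(retained) - 1)
    return mean, var


rng = random.Random(1)                  # fixed seed for repeatability
samples = [compound_poisson_sample(rng, 5.0, 2.0) for _ in range(2000)]
limits = [5.0, 10.0, 20.0, float("inf")]
moments = [retained_moments(samples, m) for m in limits]
```

Both the mean and the variance of the retained risk grow with M, so
the choice of M trades ceded premium against retained variability.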


Other  Operations  Areas 


At  this  point  there  is  a  lacuna  in  the  actuarial  literature, 
with  the  next  level  of  models  oriented  towards  planning,  management, 
and  investment  problems.  Except  for  some  simple  models  of  sales 
office  and  agent  operations,  and  an  interesting  paper  on  loss 
prevention  [S17],  there  seems  to  be  little  activity  devoted  towards 
the daily problems of Underwriting Services, Claims, Data Processing,
Records Plant, Inspection, Engineering, Medical, Rehabilitation,
Legal,  District  Offices,  and  all  the  other  operating  divisions 
of  a  large  modern  company.  Perhaps  it  is  felt  these  problems  are 
the  purview  of  other  specialists,  or  that  insufficient  holistic 
models  of  the  entire  firm  are  available.  Certainly  we  know  that 
the  costs  of  these  departments  affect  corporate  profitability,  and 
that  the  portfolio  of  risks  supervised  by  the  actuary  is  what 
entrains  these  costs.  As  a  systems  engineer,  I  feel  instinctively 
that  the  actuary  should  at  least  be  aware  of  his  influence  on  all 
aspects  of  company  operations.  Here  are  some  questions  to  think 
about: What would be the effect on your company's profitability
if claim reporting delays were cut in half? What would be the
effect if loss prevention activities were doubled? How much would
each of these changes cost?


INSURANCE BUSINESS PLANNING AND MANAGEMENT PARADIGMS


Basic  Accounting  Models 

The  past  decade  has  seen  a  great  deal  of  interest  in  developing 
analytic  and  simulation  models  for  the  management  of  the  business 
enterprise.  In  fact,  a  simplified  model  of  the  firm  has  been  in 
use  for  many  years,  represented  by  the  accounting  equation 

PI  =  CP  +  OE  +  UP  ; 


that is, premium income balances claims paid plus operating expenses
plus underwriting profit, which holds over the long run either by
adjusting premiums (rate-making), reducing claims (improved
underwriting and loss reduction), or reducing expenses (cost-cutting).

This  insurance  "production  function"  was  assumed  to  be  linear  over 
a  wide  range  of  business  volume,  with  the  loss  ratio  CP/PI  and  the 
expense  ratio  OE/PI  almost  universally  accepted  as  stable  corporate 
objectives  and  measures  of  performance. 

The  simple  static  accounting  model  needs,  of  course,  to  be 
expanded  to  include  investment  income,  and  the  associated  dynamic 
versions  include  inflation,  changes  in  reserves,  run-off  profit, 
etc.  Led  by  the  deceptively  simple  yet  elegant  ideas  of  Harald 
Bohman,  there  continue  to  be  fresh  insights  into  model  office 
analysis  and  understanding  of  management  accounting  and  control 
problems through these dynamic, deterministic models [80.6, 80.33,
80.37, 80.38, 76.7, B26, B27, B28, E3]. In many cases, these
linear accounting relationships can be used to analyze the
associated variances, and then reserve levels and profitability
standards can be set by specifying a certainty-equivalent protection
level; we might call this the safety margin or deterministic-plus
approach to modelling.

Solvency  and  Regulation 

If  one  may  judge  by  the  number  of  papers  submitted  to  Topic  1 
of  this  Congress,  the  question  of  appropriate  modelling  for 
reserve  setting,  solvency  margins,  financial  stability,  and 
external  regulation  is  one  of  the  hot  issues  in  actuarial  circles 
[80.4, 80.6, 80.7, 80.11, 80.20, 80.24, 80.25, 80.27, 80.32, 80.35].

As I understand it, the problem is one of managing the
relationship between different types of reserves, especially between
the  technical  reserves  (such  as  the  fluctuation  reserve,  claim 
reserves  related  to  IBNR  estimation,  etc.)  that  are  specified  by 
traditional  actuarial  models,  and  the  legal  reserves  that  are 
required  to  guarantee  the  solvency  of  the  corporation  under 
changing  conditions  which  are  outside  the  usual  paradigms.  To 
an  increasing  extent,  these  latter  reserves  are  being  mandated 
by regulatory agencies based upon rather simplistic "maximum
probable loss" or "deterministic-plus" concepts.

Inasmuch  as  this  area  is  still  under  active  development,  and 
will  be  the  theme  of  one  of  the  discussion  sessions,  I  will  not 
comment  further  at  this  time.  However,  I  draw  your  attention  to 
some  recent  work  on  models  for  the  surveillance  of  solvency  in 
the insurance industry using discriminant analysis [C2, G17, P9].


Projection  and  Simulation 
[...] results and for the simulation of the overall operation of an
insurance  company,  including  sales  and  investment  performance. 

The papers offered on this topic will form another of the
discussion sessions today [80.2, 80.6, 80.10, 80.14, 80.18, 80.19,
80.34, 80.35, 80.39].

Most  of  these  working  programs  and  simulation  proposals 
seem  to  be  based  on  the  deterministic,  linear  accounting  models 
just  discussed,  with  the  addition  of  random-number  generators  to 
represent  mortality  or  other  risk  mechanisms,  and  investment  and 
inflation  variability.  While  it  is  important  to  encourage  these 
proposals  at  this  early  stage  of  their  development,  I  must  say 
that  they  are  still  relatively  unsophisticated  by  comparison  with 
other  business  and  engineering  management  simulations.  I  am 
particularly  concerned  that  there  has  been  insufficient  modelling 
of  the  sales  function,  underwriting  risk  selection  dynamics,  claims 
management  and  loss  reduction  activities,  and  insurance  services 
cost  components,  to  be  able  to  rely  upon  a  linear  production 
function  and  to  use  past  operating  ratios  for  the  projection  of 
future  performance.  I  would  hope  that  many  actuaries  would  be 
challenged  to  develop  appropriate  sub-models  for  each  of  their 
company's  operations  areas,  for  use  in  the  overall  simulation.  A 
related  problem  will  be  the  need  to  develop  simulation-support 
data-collection  schemes,  as  many  of  the  traditional  data  bases 
and  management  reporting  systems  currently  in  use  are  based  on 
the  simple  accounting  paradigm. 


Stochastic-Dynamic  Models 


[...] simulation is taken by T. Pentikäinen and his coworkers [80.25,
76.14, B7, P4, P5, P6]. Based upon the Dynamic Risk paradigm,
and using only a stochastic simulation of total claims and an
empirical sales response model, their approach permits the rapid
exploration of the joint effect of a number of decision variables,
such as reserves, dividends, and sales effort, upon the evolution
of corporate performance. This approach is an appealing one to
me,  because  it  is  "top  down"  decision-oriented  modelling,  rather 
than  an  effort  to  create  a  comprehensive  simulation  "ground  up", 
and  to  let  it  run  in  an  unstructured  way.  Apparently,  experience 
with  the  simulations  has  permitted  the  authors  to  make  certain 
simplifications  in  the  model  structure  by  appeal  to  known  analytic 
results,  and  by  using  dynamic  programming  control  methods  (see 
also [76.8, F10]).

The  counterargument  to  this  approach  is  that  it  is  relatively 
sophisticated  in  concept,  and  more  difficult  to  explain  to  managers 
than  a  financial-report-oriented  simulation.  In  fact,  very  few 
insurance  modellers  have  reported  on  their  successes  and  failures 
in  selling  their  results  to  their  management.  Is  this  because 
most  of  the  work  to  date  has  not  been  supportive  of  actual  business 
decisions,  or  is  it  because  the  practical  details  of  successful 
applications  do  not  make  interesting  communications?  An  early 
paper  [M4]  suggests  that  management  gaming  may  be  an  effective 
communications  device,  but  I  have  not  seen  any  of  these  management 
exercises  reported  in  the  insurance  field  either. 

I  would  also  like  to  call  your  attention  to  [76.16],  which 
has  some  interesting  criticisms  of  the  ultimate  applicability  of 
purely mathematical methods to insurance management. The study
of how management objectives change in the face of market forces
is of course an interesting topic in itself, but one about which
there has been little discussion. In fact, the important management
function of market development and sales strategy is greatly
underrepresented in the insurance literature, by comparison with
other management science applications. Is this because actuaries
do not participate in these decisions, or are these activities
considered to be too sensitive for general communication?

We turn now to concepts and methods from other disciplines
which have influenced insurance modelling.


THE INFLUENCE OF COMPUTERS

The  advent  of  the  modern  high-speed  digital  computer  has 
revolutionized  every  scientific  field,  not  only  eliminating 
old  computational  burdens,  but  obsoleting  traditional  procedures, 
greatly  extending  the  sophistication  of  solvable  models,  and 
affecting  future  attitudes  and  progress  in  ways  which  can  only 
be  dimly  perceived. 

Two  examples  from  actuarial  science  come  to  mind.  The  late 
David  Halmstad  was  a  strong  proponent  of  a  high-level  computing 
language  called  APL,  and  delighted  in  pointing  out  its  simplicity 
and  power  in  computing  actuarial  functions.  As  an  example, 
suppose  that  I  is  a  scalar  variable  representing  the  annual 
interest  rate  in  %,  and  Q  is  an  arbitrary-length  vector  variable 
representing  the  probability  of  death  at  the  end  of  year  0,1,2,... 
Then  the  following  two  lines  of  program: 

[1]  5-*-4>+\4>(V'*-4>+\<$>D-*-Sx ( 1  +  0 .0ixJ)*-0  ,  i~l+p<2 

[2]  E+-<H\<WH>  +  \4>C+-(  0*1  +  0  .  Olxl)  -1  4-0,0 

will  calculate  the  complete  commutation  functions  D,  N,  S,  C,  M,  R 
for  this  arbitrary  mortality  table  and  interest  in  about  10  seconds 
on  my  desk-top  computer,  about  the  same  time  it  takes  me  to  open 
a  text  and  find  one  value  for  a  given  table  and  interest! 
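
For readers without APL at hand, the same calculation can be
sketched in Python. This is not a transcription of the two APL lines
above; it is a minimal version written under two assumptions: that
Q[x] is the one-year mortality rate at year x, and that the table
starts from a radix of one life. The helper reverse_cumsum plays the
part of the APL reverse-scan-reverse idiom.

```python
from itertools import accumulate


def reverse_cumsum(values):
    """Sum from each position to the end of the table."""
    return list(accumulate(reversed(values)))[::-1]


def commutation_functions(interest_pct, q):
    """Commutation functions D, N, S, C, M, R for mortality rates q.

    interest_pct is the annual interest rate in percent; q[x] is
    assumed to be the one-year mortality rate at year x.
    """
    v = 1.0 / (1.0 + 0.01 * interest_pct)   # one-year discount factor
    survivors, lives = [], 1.0
    for qx in q:
        survivors.append(lives)             # l_x, with radix l_0 = 1
        lives *= 1.0 - qx
    D = [lx * v ** x for x, lx in enumerate(survivors)]
    C = [lx * qx * v ** (x + 1)
         for x, (lx, qx) in enumerate(zip(survivors, q))]
    N = reverse_cumsum(D)
    S = reverse_cumsum(N)
    M = reverse_cumsum(C)
    R = reverse_cumsum(M)
    return {"D": D, "N": N, "S": S, "C": C, "M": M, "R": R}


# A toy three-year table that closes with certain death in year 2.
table = commutation_functions(5.0, [0.1, 0.2, 1.0])
```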

The other example involves calculating the first two moments
of the one-year loss distribution in a large life insurance
portfolio. While the mean is straightforward, the presence of a
large number of different face values and mortality rates makes
the calculation of the variance untidy; actuaries have for many
years used an approximation formula based upon grouping of the
data. Two years ago, Hans Bühlmann and I talked with an actuary
in San Francisco who was going to carry out this calculation on
his company's computer, and discovered that he had a few
milliseconds of idle central processor unit time while reading the
policy master file. So he wrote a small program which "for free"
calculated the exact distribution of losses for the entire
portfolio through convolution!
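
The convolution idea is easy to sketch in Python. The program
described above is not available, so what follows is only an
illustration under stated assumptions: independent lives, face
values expressed in whole monetary units, and an invented
four-policy portfolio.

```python
def portfolio_loss_distribution(policies):
    """Exact one-year loss distribution by successive convolution.

    `policies` is a list of (face_value, q) pairs, with face values
    in integer monetary units and q the one-year death probability.
    Lives are assumed independent. Returns a list `dist` in which
    dist[k] is the probability that total losses equal k units.
    """
    dist = [1.0]                            # no policies: loss is zero
    for face, q in policies:
        new = [0.0] * (len(dist) + face)
        for k, p in enumerate(dist):
            new[k] += p * (1.0 - q)         # policyholder survives
            new[k + face] += p * q          # death claim of `face` units
        dist = new
    return dist


# Hypothetical portfolio: (face value, mortality rate) per policy.
policies = [(1, 0.01), (2, 0.02), (2, 0.05), (5, 0.001)]
dist = portfolio_loss_distribution(policies)
mean = sum(k * p for k, p in enumerate(dist))
variance = sum((k - mean) ** 2 * p for k, p in enumerate(dist))
```

The mean and variance read off the exact distribution agree with the
usual sums over policies, which gives a simple check on the program.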

Let  me  pose  the  following  scenario,  which  I  do  not  think 
is  at  all  unreasonable;  imagine  that,  by  1985: 


(1) All university graduates will be quite sophisticated
in the programming and use of computers, including
scientific computations, such as large-scale optimization
and stochastic simulations, as well as basic data
management principles;

(2) The economics of digital computation will have continued
to drop to such a point that any of the calculations
or simulations that have been proposed at this Congress
can be routinely accomplished in a few seconds, and
further, that hand-held calculators or local
minicomputers will be able to compute all of the life and
many of the non-life actuarial functions instantaneously
upon demand;

(3)  That  extensive  computer  communications  networks  will 
enable  the  global  transmission  (via  satellite)  of 
insurance  experience  data  banks,  economic  variables, 
industrial  and  trade  statistics,  etc.,  at  nominal 
cost. 


My questions to you are: How should the actuarial student then
be trained? And what will you have him do when he joins your
company?


THE  INFLUENCE  OF  ECONOMICS 


The  Utility  Theory  Paradigm 

Insurance  is,  we  hope,  an  economic  enterprise;  and  yet  the 
two  fields  had  surprisingly  little  to  say  to  each  other  until 
the  expected  utility  hypothesis,  first  proposed  by  Daniel  Bernoulli 
in 1732, was given adequate justification by J. von Neumann and
O. Morgenstern in 1947 (see [B29]); in this important work, they
gave  a  set  of  behavioristic  assumptions  which  showed  how  a 
"rational  economic  man"  (REM)  would  consistently  choose  between 
any  two  random  outcomes  with  known  distributions,  say,  between 
x  and  y  .  Given  these  assumptions,  together  with  some  technical 
fine  points,  it  follows  that  a  REM  would  behave  as  if  he  evaluated 
the  outcomes  by  using  a  personal  utility  function,  u(.)  ,  mapping 
the  outcomes  and  their  associated  probabilities  into  a  single 
scalar  value  used  for  comparison.  In  other  words,  (if  large  values 
of  outcome  are  desirable)  an  REM  would  consistently  prefer  a 
random  outcome  x  to  a  random  outcome  y  if  and  only  if 

U_x = E u(x) > E u(y) = U_y ,

for some nondecreasing function u(.). Because this paradigm
only purports to rank outcomes, the scale of u(.) is undefined;
a utility function v(x) = a u(x) + b (a > 0) will give the
same preference. Therefore one cannot simply say that an individual
should have a nonlinear preference for money.

Although there have been many attacks on this paradigm, based
usually upon experiments in which individuals can be tricked into
violating the REM hypothesis, it has still withstood the test of
time reasonably well. Moreover, it is a useful paradigm in
insurance, as has been demonstrated repeatedly by Karl Borch [80.47,
76.8, B29, B30, B31, B32, B33, B34] and many others, in a variety
of different applications.

Demand  for  Insurance 

The first success of this paradigm was in satisfactorily
"explaining" why anyone would buy insurance against a random
loss x and be willing to pay more than the "fair" expected
value, Ex. Suppose an individual with basic wealth W is faced
with such a loss, but can purchase an insurance policy to cover
the loss at cost Π; essentially, this is a problem in choosing
between a random outcome W - x and a deterministic outcome
W - Π. According to the von Neumann-Morgenstern paradigm, a
rational economic man would buy the insurance only if the premium
were less than the indifference value Π given by the solution
of u(W - Π) = Eu(W - x). It is a relatively simple matter to
show that such an indifference value greater than Ex exists for
all W only if u''(x) < 0, that is, if the individual has a
concave-downward utility curve, when he is said to be
"risk-avoiding". It is also possible to show that if the range of
outcomes is not too large, then the fluctuation loading, Π - Ex,
an individual is willing to pay to get insurance is proportional
to the variance of the loss and the "risk-aversion coefficient"
-u''/u', evaluated at the reduced wealth W - Π [P11].

The same approach can of course be applied to a (rational)
insurance company to set acceptable premium limits over which it
will underwrite a given risk, given its current reserves and
portfolio. Luckily, since the risk aversion coefficient of a
company is usually much less than that of an individual, there
is usually room for negotiating a commercially viable actual
premium.
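
A small numerical sketch may help to fix ideas on the individual's
indifference premium. The Python code below solves
u(W - Π) = Eu(W - x) by bisection and compares the result with the
variance-based approximation. The logarithmic utility, the wealth
level, and the two-point loss distribution are illustrative
assumptions; the factor 1/2 in the approximation is the standard
second-order constant, which the text above does not spell out.

```python
import math


def indifference_premium(utility, wealth, outcomes, tol=1e-12):
    """Solve u(W - P) = E u(W - x) for the premium P by bisection.

    `outcomes` is a list of (loss, probability) pairs. `utility` must
    be increasing, so that u(W - P) decreases as P grows.
    """
    target = sum(p * utility(wealth - x) for x, p in outcomes)
    lo, hi = 0.0, max(x for x, _ in outcomes)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if utility(wealth - mid) > target:
            lo = mid                    # insurance still preferred
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical case: wealth 100, loss of 10 with probability 0.1.
wealth = 100.0
outcomes = [(0.0, 0.9), (10.0, 0.1)]
premium = indifference_premium(math.log, wealth, outcomes)

mean_loss = sum(p * x for x, p in outcomes)
var_loss = sum(p * (x - mean_loss) ** 2 for x, p in outcomes)
# For u = log, the risk-aversion coefficient -u''/u' at wealth w is 1/w.
risk_aversion = 1.0 / (wealth - premium)
approx_premium = mean_loss + 0.5 * risk_aversion * var_loss
```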

Premium  Calculation  Principles 

The  above  result  on  a  REM's  permissible  fluctuation  loading 
is,  of  course,  reminiscent  of  our  previous  discussion  of  premium 
setting,  where  a  loading  proportional  to  variance  was  justified 
on the basis of additivity. Led by the works of H. Gerber, a
number of recent papers have explored the variety of abstract
properties which one might require of a premium calculation
principle, including the utility theory approach [80.12, 76.10,
F8, G3, G5, G14, G16, L3].

One interesting result is that, if we require that the
utility theory premium Π be independent of the individual's
wealth (or the company's reserves and current portfolio), the
associated utility function must either be proportional to u(x) = x
(the expected value principle), or u(x) = c^{-1}[1 - exp(-cx)],
the exponential utility principle. In the latter case, the risk
aversion coefficient -u''/u' is just the constant c over all
values of outcomes, and the utility premium is simply
Π = c^{-1} ln[E exp(cx)]. This model is now being applied to a variety
of  traditional  risk  problems,  and,  like  all  alternative  paradigms, 
at  least  forces  the  scientist  to  rethink  his  basic  assumptions; 
no  doubt  we  shall  see  more  discussion  on  this  point. 
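
The exponential premium is simple enough to compute directly. The
Python sketch below evaluates (1/c) ln E exp(cx) for discrete loss
distributions and checks one property that follows from the formula:
for independent risks the premiums add. The two loss distributions
and the value of c are invented for the example.

```python
import math


def exponential_premium(c, outcomes):
    """Premium (1/c) ln E exp(c x) for a discrete loss distribution.

    `outcomes` is a list of (loss, probability) pairs and c > 0 is
    the constant risk-aversion coefficient.
    """
    return math.log(sum(p * math.exp(c * x) for x, p in outcomes)) / c


def convolve(a, b):
    """Distribution of the sum of two independent discrete losses."""
    total = {}
    for xa, pa in a:
        for xb, pb in b:
            total[xa + xb] = total.get(xa + xb, 0.0) + pa * pb
    return sorted(total.items())


# Two hypothetical independent risks and a risk-aversion coefficient.
risk_a = [(0.0, 0.9), (10.0, 0.1)]
risk_b = [(0.0, 0.8), (5.0, 0.2)]
c = 0.05
premium_a = exponential_premium(c, risk_a)
premium_b = exponential_premium(c, risk_b)
premium_sum = exponential_premium(c, convolve(risk_a, risk_b))
```

Note that no wealth term appears anywhere in the calculation, which
is the independence property described above.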

Risk-Sharing and Game Theory

The  utility  theory  paradigm  also  gives  a  fresh  viewpoint  on 
the  problem  of  risk  retention  and  optimal  reinsurance  and  risk¬ 
sharing  arrangements.  The  basic  results  are  due  to  Borch  [B31] . 
Suppose that we have a group of n insurance companies, indexed
by i = 1, 2, ..., n, each of which is facing a random outcome x_i,
and behaves as if it has a risk-avoiding utility function u_i(.).
The question is: under what circumstances can they agree to form
a risk-exchange (REX) or risk pool, in which company i will
now assume the random risk y_i = y_i(x_1, ..., x_n)? We will, of
course, require that the treaty (contractual agreement) functions
{y_i(.)} be such that all claims are paid, i.e.,
x_1 + x_2 + ... + x_n = y_1 + y_2 + ... + y_n. (Reinsurance
arrangements are a variation of this model in which conservative
side payments are permitted.)

Two interesting results obtain. The first is that, if there
is any treaty which improves the expected utility of all companies,
then there are many such treaties, defining a Pareto-optimal set
of arrangements over which the companies must bargain for individual
advantage. On the other hand, the treaties depend only upon
the sum of the pooled outcomes, x_0 = x_1 + x_2 + ... + x_n, even
if the outcomes are statistically dependent.

Even though the exact REX is not specified by this model,
the form of the y_i(x_0) is given in terms of the individual
u_i(.); for example, if all utilities are exponential, then linear
(quota) risk-sharing takes place, so that y_i = a_i x_0 + b_i, with
the a_i related to the individual risk-aversion coefficient.
The indeterminacy in Borch's result is reflected in the fact that
the side payments b_i (with Σ b_i = 0) are still open to
negotiation, and must be determined in terms of some other model,
for example, by reference to the theory of games. A variety of
papers have explored the various implications of this Pareto-optimal
solution (see, e.g., [76.13, 76.19, B31, B32, B43, L6, M6, P12, R5]).
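
For the exponential case, the linear sharing rule can be sketched in
Python as follows. The text says only that the quotas are related to
the risk-aversion coefficients; the specific formula used here, with
each quota proportional to the risk tolerance 1/c_i, is the standard
form of that result and is stated as an assumption. The outcomes,
coefficients, and side payments are invented, and the side payments
are simply supplied by hand, since the model leaves them open.

```python
def quota_shares(risk_aversions):
    """Quota of the pooled outcome taken by each company.

    Assumes exponential utilities with risk-aversion coefficients
    c_i > 0; each share is proportional to the risk tolerance 1/c_i,
    so the shares sum to one.
    """
    tolerances = [1.0 / c for c in risk_aversions]
    total = sum(tolerances)
    return [t / total for t in tolerances]


def pooled_outcomes(outcomes, shares, side_payments):
    """Apply the linear treaty: share of the pool plus side payment."""
    if abs(sum(side_payments)) > 1e-9:
        raise ValueError("side payments must sum to zero")
    pool = sum(outcomes)                # only the pooled sum matters
    return [a * pool + b for a, b in zip(shares, side_payments)]


# Three hypothetical companies; the least risk-averse takes most risk.
shares = quota_shares([0.01, 0.02, 0.04])
x = [30.0, -5.0, 12.0]                  # outcomes before the exchange
y = pooled_outcomes(x, shares, [1.0, -0.5, -0.5])
```

Because the shares sum to one and the side payments sum to zero, the
outcomes after the exchange add up to the same total as before.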

Gerber [G12] was the first to add inequality side conditions
to the REX model, which limit the possibility that the treaty
will permit the invasion of reserves of an individual company.
This modification leads to the optimality (under the utility
paradigm) of the stop-loss contract for simple reinsurance, and gives
linear quota-sharing-by-layers for the general REX treaties under
exponential utilities; the game-playing arbitrariness is now
reflected in the unspecified layer values. Recently, the author
and Hans Bühlmann have attempted to remove this element of
negotiation by invoking an additional principle, namely, that the REX
also be a fair Pareto-optimal risk exchange, that is, the treaties
must satisfy E y_i = E x_i. This mixture of the utility theory
paradigm and a traditional insurance business concept is somewhat
confusing at first glance, but it does lead to unique risk sharing
treaties, for example the unique determination of the layers in
the exponential cases. Unfortunately, if the participants are
grossly mismatched in risk-capacity and risk-aversion, then this
unique solution may not improve the expected utility of all
participants, and the pool will not form; further details may be
found in [B44].

The  fact  that  this  new  model  leads  to  justification  of  some 
of  the  reinsurance  and  risk  pool  treaty  forms  actually  used  in 
practice has generated considerable interest, and there will no
doubt be continued development of these ideas [B5, B41].

Notice  also  that  there  is  a  considerable  economic  literature 
developing  in  the  area  of  agent/principal  risk-sharing  agreements, 
with  emphasis  on  incentive  fee  structures,  and  problems  of  partial 
observability  (see,  e.g.  [H18,  S14]). 


THE  INFLUENCE  OF  OTHER  DISCIPLINES 


A  number  of  other  disciplines  have  contributed  concepts, 
models,  and  methods  to  insurance.  For  example,  I  am  pleased  to 
see  that  many  papers  now  reference  my  own  field  of  operations 
research,  and  that  many  national  societies  include  at  least 
some  exposure  to  O.R.  in  their  training  recommendations.  However, 
many actuaries and most of the recent articles seem to equate
O.R. with a collection of analytic methods [W6], such as mathematical
programming (optimization) [80.16, 80.28, J7, S3, S5],
dynamic programming [P6], linear systems [80.16, G15], queuing
analysis, reliability theory, decision analysis [S16], and so on.
This  is  not  at  all  what  I  had  in  mind  when  I  surveyed  O.R. 
applications  in  the  insurance  industry  in  1972  [J2]  and  tried  to 
stress  the  model-building  opportunities  in  the  operations  and 
management  areas  of  the  insurance  enterprise.  Even  though 
operations  research/management  science  societies  continue  to 
sponsor  sessions  on  insurance  models,  they  too  seem  to  be  mostly 
methodology-oriented,  rather  than  demonstrating  the  constructive 
interaction  I  had  hoped  for  between  the  two  communities.  Perhaps 
operations  research  has  itself  gone  too  far  in  its  own  pursuit  of 
normal  science  [J18] . 

Contributions  to  this  Topic  also  reveal  concepts  from  other 
disciplines such as information theory [80.5], systems and cybernetics
theory [80.3, 80.40], control theory [80.36], and futurism
[80.21].  I  believe  that  it  is  important  to  keep  these  dialogues 
open  with  other  fields,  for  one  is  never  able  to  predict  where 
the next successful paradigm will be generated.

Of course, probability and statistics have been a continuing
source of techniques and ideas for actuarial science. Because
this area is the subject of Topic 2, I will confine my remarks
to modelling issues in Bayesian statistics.


MODELS FROM BAYESIAN STATISTICS


A Controversial Paradigm

I would like now to discuss a statistical methodology which
has proven to be an especially rich source of actuarial models
over the past 20 years. I refer, of course, to the Bayesian revolution
which was foreseen by de Finetti in 1957 [D3], and developed
by L. J. Savage [S1], D. V. Lindley [L11] and many others
interested in applications of statistics.

It would take several days of lecturing to cover the philosophical
implications of the Bayesian approach and to contrast
it with traditional methods; there are many advocates who are
much more qualified for this task than I (see, e.g. [B1, B3,
B4, C3, E2, E4, L11, L13, L14, S1]). The "controversial" nature
of the Bayesian approach seems, in my opinion, to be related to
a personal worldview of many professional statisticians, namely
that, as specialist consultants, they are not a priori permitted
to have any opinion about the real problem at hand, but "the
data must speak for itself." This attitude leads to a variety
of ingenious theoretical constructions, which unfortunately can
in many cases be shown to have very poor conditional properties
or to exhibit a lack of coherence with the laws of probability [B4,
L11, R4].

Fortunately,  the  issue  of  whether  the  analyst  has  any  prior 
or  collateral  information  about  the  problem  at  hand  is  hardly 
a  difficulty  for  the  actuary  or  the  engineer.  As  Norberg  points 
out (emphasis added):


"... in class-rating situations the actuary must to some
extent rely on subjective judgement since the rate-making
decision is forced upon him right here and now and cannot
be deferred or put off. If he wants to remain in business,
he should not tell the client that, 'Your new gas-tanker
is, of course, a most interesting object of
insurance, and we look forward to negotiate the terms
as soon as the hazard can be assessed from objective
facts, say in 10 years or so.'" [N4]


This compulsion to a decision is so embedded in historical
insurance underwriting that it seems difficult to imagine a philosophical
discussion in London coffeehouses about whether or not
to permit subjective judgements. Of course, coherent actuaries
(ones who agree to use the laws of probability) may disagree on
the probabilities to be associated with certain unknown random
quantities, since they do not share the same training and experience
in the real world, i.e., their current information states
differ.

But the advantage of the Bayesian paradigm is that it provides
a mechanism for the orderly sharing and rationalization of
this information, both when making the initial underwriting
decision, and later, as experimental facts are accumulated.

And this is the point I would like to emphasize [L14]:

"The Bayesian approach to statistics is a complete,
logical framework for the discussion and solution of
problems of inference and noncompetitive decision-making."


Thus, in addition to providing a methodology for estimation,
prediction and decision-making [A1], as in Subject 2 of this
Congress, it also helps the model-building process, since the
scientist is forced to make explicit all of his underlying
assumptions about the influences of one random quantity upon
another, especially which quantities are conditionally independent
of each other, and what his relative uncertainty is about
as-yet-unobserved quantities (his current information state). This
"warts and all" specification of the model is somewhat embarrassing
when first attempted, since it prevents the subterfuge, tacit
assumptions, and "ad-hocery" which sometimes characterize traditional
estimation methods.

And, as suggested above, the requirement for complete specification
permits scientists who dissent on underlying probabilities
or on the exact form of a model relationship to communicate in
an orderly manner about their differences, to explore the consequences
of their differences, and to rationalize these differences
or change their minds, in the face of experimental data. At any
point  in  the  analysis  they  are  free  to  make  approximations  or  use 
appealing  empirical  methods,  and  it  will  be  apparent  to  all 
exactly  what  has  or  has  not  been  "swept  under  the  rug." 

I  would  like  to  try  and  illustrate  some  of  these  points  with 
reference  to  traditional  insurance  models,  approached  from  the 
Bayesian  point  of  view. 

Life  Table  Analysis  under  Competing  Risks 

Consider the classical problem of estimating mortality rates
in a life table with two decrements, death and withdrawal. For
simplicity, consider the single age interval (0,1] in which we
observe that N "starters" at time t = 0 have resulted in
D deaths in service and W withdrawals while alive, leaving
E = N - D - W "enders" at t = 1. We make the usual
modelling assumption that mortality and withdrawal are independent
competing risk processes, with the (continuous) age cumulative
distribution functions

$$\Pr\{\text{age at death} \le t\} = P_D(t) ; \quad \Pr\{\text{age at withdrawal} \le t\} = P_W(t)$$

defined over t in $(0, \omega]$. The problem is to estimate the
absolute rates of decrement for this interval,

$$q_D = P_D(1) ; \quad q_W = P_W(1) ,$$

from the data (N, D, W, E). The estimate of $q_D$ should, of course,
be greater than D/N because the observed death data did not
include those in W who died after withdrawal.

The traditional approach to this problem is to make some
additional special assumptions about the forms of the cdfs
$P_D(\cdot)$, $P_W(\cdot)$ (or their associated failure rates); for example,
if the decrements are assumed to be uniformly distributed over
(0,1], we obtain the familiar:

$$\hat q_D = D/(N - \tfrac{1}{2} W) ; \quad \hat q_W = W/(N - \tfrac{1}{2} D) .$$

A difficulty with these estimators is that they are inconsistent,
in that they need not approach the true absolute rates as
$N \to \infty$.

Lindley [L12] points out that all of the information contained
in the data is given by the likelihood:

$$L(\pi_D, \pi_W) = (\pi_D)^D (\pi_W)^W (1 - \pi_D - \pi_W)^{N-D-W} ,$$

where $\pi_D$ and $\pi_W$ are the observed probabilities of death and
withdrawal under competing risks, respectively:

$$\pi_D = \int_0^1 [1 - P_W(t)]\,dP_D(t) ; \quad \pi_W = \int_0^1 [1 - P_D(t)]\,dP_W(t) ; \quad \pi_D + \pi_W = q_D + q_W - q_D q_W .$$

At first glance, it appears from the likelihood that a Bayesian
analysis would now require some prior probabilities on $\pi_D$ and
$\pi_W$ to proceed. However, since the desired estimators are for
$q_D$ and $q_W$, it makes sense to use the above relationships to
eliminate the observed probabilities in favor of the absolute
rates of decrement insofar as possible. By defining

$$r = \frac{1}{q_D q_W} \int_0^1 P_D(t)\,dP_W(t) ,$$

the likelihood can be rearranged into:

$$L(q_D, q_W, r) = (q_D)^D (1 - q_D)^E (q_W)^W (1 - q_W)^E (1 - r q_D)^W (1 - (1-r) q_W)^D .$$

The additional "nuisance parameter," r, indicates explicitly
what additional (dependent) information is contributed to the model
by $P_D(\cdot)$ and $P_W(\cdot)$; or stated another way, the complete forms
of the age cdfs are irrelevant, and only the three parameters
r, $q_D$, and $q_W$ should enter into the estimation process.

Notice that if the Bayesian analyst had well-developed experience
regarding these three parameters prior to the experiment,
he would now be able to compute a posterior-to-data distribution
for $q_D$ and $q_W$ in the obvious application of Bayes' law,
even if he specified that the parameters were dependent.

What about r? A physical interpretation is somewhat
elusive; it is essentially the conditional probability of death
"winning out" over withdrawal, given that both death and withdrawal
will occur in (0,1]. But (and here is the surprising
part) it turns out that under many of the usual additional
assumptions made in mortality processes, r will have a value
at or near 0.5, for example, under constant failure rates or
with linear splines used for the cdfs. In other words, after
further additional modelling assumptions, r seems to be near-deterministic,
in the sense that it has a very "tight" prior
around $r_0 = 0.5$, or some other natural value given by valuation
calendar assumptions; it is thus hardly relevant whether r is
correlated with $q_D$ and $q_W$, a priori.
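
The claim that r sits near 0.5 can be checked numerically. The sketch below is an illustration only and is not taken from Lindley's analysis. It assumes independent decrements with constant forces `mu` (death) and `nu` (withdrawal), both hypothetical parameter names, and it reads r as Pr{death precedes withdrawal, and both occur in (0,1]} divided by the product of the two absolute rates, following the verbal interpretation given above.

```python
import math


def r_constant_forces(mu, nu):
    """Nuisance parameter r under constant forces of decrement.

    Assumes independent exponential times to death and withdrawal, so that
    P_D(t) = 1 - exp(-mu * t) and P_W(t) = 1 - exp(-nu * t).
    """
    q_d = -math.expm1(-mu)  # absolute rate of death, P_D(1)
    q_w = -math.expm1(-nu)  # absolute rate of withdrawal, P_W(1)
    # Closed form of integral_0^1 P_D(t) dP_W(t):
    # q_w - nu / (mu + nu) * (1 - exp(-(mu + nu)))
    death_first_both_occur = q_w + nu / (mu + nu) * math.expm1(-(mu + nu))
    return death_first_both_occur / (q_d * q_w)


if __name__ == "__main__":
    # Illustrative (made-up) forces of decrement.
    for mu, nu in [(0.01, 0.05), (0.05, 0.10), (0.20, 0.30)]:
        print(mu, nu, round(r_constant_forces(mu, nu), 4))
```

When the two forces are equal, symmetry gives r = 0.5 exactly; for unequal but moderate forces the value moves only slightly away from 0.5, which is the "tight prior" behaviour described in the text.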

Lindley [L12] carries out additional numerical analyses which
show that, when N, D, and W are large, most of the information
is carried by the likelihood, rather than the prior, and that
the mode of the posterior-to-data density of $q_D$ is given by

$$\hat q_D = D/(N - (1 - r_0) W) ,$$

thus validating traditional estimators. But, even more satisfactorily,
it is possible to obtain estimates of the variance
of the density of $q_D$, which helps to understand what is a "large"
data set. It is also possible to address the question of whether
or not this estimator is consistent as $N \to \infty$; it turns out that
it is impossible to eliminate the (admittedly small) original
uncertainty about r, no matter how large the data set.

(The  same  problem  is  tackled  by  classical  maximum  likelihood 
methods  in  [H14].) 

Graduation 

If  we  have  only  mortality  effects  (W  =  0)  ,  then  grouped 
data  still  exhibits  significant  fluctuation.  One  possible  approach 
to  the  problem  of  estimating  mortality  rates  is  to  apply  Bayesian 
ideas  directly  in  graduating  (smoothing)  raw  observed  mortality 
rates [J22, H13]. In my opinion, this approach reveals immediately
(through the explicitness of the assumptions required in the
Bayesian methodology) a major modelling difficulty: the graduator
must specify a great deal of prior information about the covariance
structure of the random parameters associated with each
age interval, assumed to be multinomially distributed. Under
certain additional assumptions, and the use of a clever transformation
to obtain an approximately normal likelihood of the
correct conjugate type, [H13] obtains smoothed rate estimates
reminiscent of multidimensional credibility theory [J6, J9].

It  should  be  emphasized  that  the  difficulty  here  is  not  in 
the  use  of  Bayesian  methodology,  but  rather  in  the  fact  that 
the  King-Whittaker-Henderson  free-form  graduation  and  smoothing 
techniques are rather too tightly posed; either one believes
specifically in the fit-versus-smoothness objective and accepts
the traditional machinery, or one must make a large number of
supplementary probabilistic assumptions about the way in which
the data roughness is generated. In the latter case, it is clear
that focusing the attention of the modellers upon the dependency
assumptions about roughness in adjacent intervals is extremely
useful in clarifying the model. This structured argumentation
also gives insight into the classical procedures and into other
proposals for automatic graduation (see the discussions in [G15,
H13, S5]). Of course, it does not help much in more artistic
ad  hoc  approaches. 

Mortality  Law  Models 

In many reliability problems [J16] and in the comparison of
certain mortality tables (see, e.g. [80.45, W3]), it may be
possible to make a stronger assumption about the form of mortality
versus age, and leave our uncertainty associated with certain
free parameters. For example, if we assume that the shape of the
failure rate is known as a continuous function of age, except for
a scale parameter, this would give a complementary distribution
function

$$\Pr\{\text{remaining lifetime} > t\} = \exp[-\theta Q(t)] ,$$

where $\theta$ is the unknown parameter, and $Q(\cdot)$ is the prototype
cumulative hazard function, for example, Gompertz' form. If one
assumes that prior information about values of $\theta$ can be expressed
in terms of a Gamma density, this leads to particularly simple
Bayesian updating formulae.
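
As a rough illustration of why the Gamma prior is convenient here, the sketch below assumes a Gamma prior on the scale parameter with shape `shape` and rate `rate` (hypothetical names), and a list of observed lives, each followed for a duration t and either observed to die at t or still alive at t. Under the survival function above, each life contributes exp[-theta Q(t)] to the likelihood, with an extra factor proportional to theta for an observed death, so the posterior is again Gamma. The Gompertz-type prototype and its constant `b` are assumptions made only for the example.

```python
import math


def gompertz_prototype(t, b=0.09):
    """Hypothetical prototype cumulative hazard Q(t) = (exp(b t) - 1) / b."""
    return math.expm1(b * t) / b


def update_gamma(shape, rate, observations, Q):
    """Posterior Gamma (shape, rate) for theta.

    observations: iterable of (t, died) pairs, where t is the observed
    duration and died is True if the life expired at t, False if the
    life was still in force at t.
    """
    observations = list(observations)
    deaths = sum(1 for _, died in observations if died)
    exposure = sum(Q(t) for t, _ in observations)  # total cumulative hazard
    return shape + deaths, rate + exposure


def posterior_mean(shape, rate):
    """Mean of a Gamma density with the given shape and rate."""
    return shape / rate


if __name__ == "__main__":
    # Made-up cohort: three survivors to t = 2 and one death at t = 1.5.
    cohort = [(2.0, False), (2.0, False), (2.0, False), (1.5, True)]
    a, b = update_gamma(2.0, 100.0, cohort, gompertz_prototype)
    print(a, round(b, 3), round(posterior_mean(a, b), 5))
```

Survivors add exposure without adding to the death count, so the posterior mean of the scale parameter drifts downward while no one dies and jumps upward at each death.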

In [J19], the author applies this model to the adaptive
modification of life contingency premium reserves, assuming that
the underwritten cohort consists of lives which have the above
lifetime distribution, but in which there is a common value of
$\theta$, selected from a larger collective of lives under surveillance
by the insurance company. For example, just after underwriting
at common age x, the actuary would examine all of his prior
knowledge about the variation of $\theta$ over previously insured,
similar cohorts, and would set an initial reserve at the a priori
average value $E\{{}_0V_x(\theta)\}$, where ${}_tV_x(\theta)$ is the deterministic
legal reserve at age t + x for the assurance to x, given $\theta$.
As time passes, expirations in this cohort will occur randomly,
changing our posterior estimate of $\theta$, as more or fewer expirations
than expected occur (this model also has the property that
it uses information about the lives still in existence at time
t). The correct Bayesian adaptive reserve per contract at time
t is then $E\{{}_tV_x(\theta) \mid \text{Expiration History in } (0,t]\}$, obtained through
routine use of the updating formula. This generates a curve which
drifts (upward for annuities, downward for assurances) through
the family of classical reserve curves when no one dies, then
jumps (down or up) at the random instant of death; naturally,
with a large number of lives in the cohort, it seeks out and tends
to follow the curve corresponding to the correct $\theta$ for this
cohort. A similar model could of course be useful in other life
insurance problems, for example, in the valuation of a portfolio
for reinsurance purposes. ([76.3] suggests a credibility approach
to group term life insurance.)

There seem to be few other Bayesian models in life assurances;
however, it is a natural approach for pension systems studies,
most of which are still dominated by the deterministic-plus
approach. Shapiro studies the adequacy of projected retirement
costs in a pension model in which persistency of participants,
mortality rates and interest factors are random variables whose
parameters are also stochastically distributed [S11, S12, S13].
The projected probability distribution of final accumulation given
there could be updated dynamically by Bayesian methods through
population experience.

Credibility  Theory 

The most-developed Bayesian model in insurance must be credibility
theory; surely you have all heard something about it
already, good or bad. In its simplest form, we imagine a risk
collective, similar to that already described, in which each risk
is characterized both by his outcome random variable, x, and by
an unknown risk parameter, $\theta$, which describes his particular
risk variability within the collective; the model for the outcome
is the simple likelihood density $p(x \mid \theta)$ and $u(\theta)$ is the
structure function for the distribution of $\theta$ over the collective.

For the selected risk, we cannot observe his risk parameter
$\theta$ directly, but we can record his experienced outcomes
$\mathbf{x} = (x_1, x_2, \dots, x_n)$ over n years of experience. By a direct
application of Bayes' law, then, it is possible to update our
estimate of the risk's parameter distribution from $u(\theta)$ to
$u(\theta \mid \mathbf{x})$. In experience-rating applications, the basic problem
is to predict the value of the unknown future outcome $x_{n+1}$,
using both the experience $\mathbf{x}$ for this risk, together with
information about the collective. This emphasis on future values
of observables, rather than on unverifiable values of parameters,
is typical of practical applications, but has only recently
begun to receive its proper emphasis in Bayesian statistics
[A1].

Perhaps the most fascinating part of the credibility story
is that the practical importance of this problem was recognized
by the American actuaries A. H. Mowbray and A. W. Whitney over
60 years ago, when statistics was in its infancy. They proposed
a formula which we would write in the above notation as:

$$\hat x_{n+1} = (1 - Z) m + Z \bar x ; \quad \bar x = \sum_t x_t / n ; \quad Z = n/(n + n_0) .$$

$\bar x$ is, of course, the average observed experience of this particular
risk, and will, as experience accumulates, be a "good" estimator
of $x_{n+1}$; however, these pioneers reasoned, with small
amounts of experience data, $\bar x$ may be a highly variable
estimator, so why not mix it with the manual premium, m, which
is already tabulated for the collective as a whole?

Using heuristic reasoning based upon pooling of data arguments,
they argued that the mixing coefficient, Z, which they
called the credibility factor, should be of the form indicated
above, with the time constant, $n_0$, chosen from experience.

The next part of the story comes in the 1950's, just at
the beginning of the resurgence of interest in the Bayesian
approach, when A. L. Bailey [B1] and A. L. Mayerson [M3] showed
that the experience rating problem could be posed as the problem
of estimating a posterior-to-data mean, $E\{x_{n+1} \mid \mathbf{x}\}$, which is
already implied in our collective model above. They carried
out this calculation for several important likelihoods $p(x \mid \theta)$
and structures $u(\theta)$, and showed that the linear credibility form
was exact, even to its dependence on n and the interpretation
of the manual premium as $m = E\,E\{x \mid \theta\}$. The time constant $n_0$
was a function of the (hyper)parameters of the structure function.

A good summary of this "American school of partial credibility"
can be found in [L18].

The scene now shifts to Switzerland, where Bühlmann showed
in 1967 [B36] that, if one chooses to approximate $E\{x_{n+1} \mid \mathbf{x}\}$ by
a linear function of the observed data, say $a + b\bar x$, then the
least-squares estimate was again the familiar credibility form,
only now there was an interpretation for $n_0$ as well as m.
More precisely, if D and E are the two components of collective
variance previously defined, then $n_0 = E/D$. This result
gave impetus to a number of works from E. Straub and the rest
of the "Swiss School."

It also stimulated the author to think in 1973 about various
extensions of the basic model, and also to examine the link between
the exact linear formulae of Bailey and Mayerson, and the
approximate results of Bühlmann. Now, it is known from statistics
that if the sample mean is the only sufficient statistic (apart
from n) for the parameter $\theta$ in a sequence of i.i.d. trials
(and if the range of x is fixed, and certain technical regularity
conditions are met), then the likelihood must be one of the
members of the Koopman-Pitman-Darmois exponential family, i.e.,

$$p(x \mid \theta) = \frac{a(x) \exp(-\theta x)}{c(\theta)}$$

over some appropriate range and measure for x, making the likelihood
of the experience data $\mathbf{x} = (x_1, x_2, \dots, x_n)$ equal to

$$L(\theta) = p(\mathbf{x} \mid \theta) = \frac{\prod_t a(x_t)\, \exp(-\theta n \bar x)}{[c(\theta)]^n} .$$

This family includes many of the favorite models in insurance,
such as the Poisson, the Binomial, the Exponential and the Gamma
with fixed shape, the Normal with fixed variance, etc. Now, if
we simultaneously assume that the collective structure function
(i.e. the Bayesian prior density on $\theta$) is the so-called natural
conjugate prior

$$u(\theta) \propto [c(\theta)]^{-n_0} \exp(-\theta x_0) ,$$

(again, over some natural range for $\theta$), then it is extremely
easy to form the posterior-to-data density of the structure
function for the risk under study, $u(\theta \mid \mathbf{x})$; essentially, it is
of the same form as $u(\theta)$, but with the hyperparameters $n_0$ and
$x_0$ replaced by $n_0 + n$ and $x_0 + n\bar x$, respectively. This
closed-under-sampling property is extremely convenient, and for many of
the practically useful likelihoods gives also a convenient prior,
such as the Gamma or Normal [A1, J5]. So far, so good. The
bonus comes after making a regularity assumption which is always
satisfied in practice, whence we find [J6, J8] that these conjugate
families of densities imply that

$$E\{x_{n+1} \mid \mathbf{x}\} = (x_0 + n\bar x)/(n_0 + n)$$

exactly! In other words, the hyperparameter $n_0$ is just the
ratio E/D found by Bühlmann, and the hyperparameter $x_0 = n_0 m$.
So, now we know that the linear credibility rule is exact for a
rather wide class of important distributions, and there are
further indications that it is extremely robust in many other
situations as well.
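
A small numerical check may make the exactness result concrete. The sketch below is an illustration under stated assumptions, not a reproduction of the cited derivations: it takes the familiar Poisson claim-count likelihood with a Gamma structure function of shape `alpha` and rate `beta` (hypothetical names and made-up values), for which the posterior mean of next year's outcome is (alpha + sum of claims)/(beta + n). In that case the manual premium is m = alpha/beta and the time constant is n0 = beta, which also equals E/D since E = alpha/beta and D = alpha/beta**2.

```python
def credibility_premium(claims, m, n0):
    """Linear credibility forecast (1 - Z) m + Z xbar with Z = n / (n + n0)."""
    n = len(claims)
    z = n / (n + n0)
    xbar = sum(claims) / n if n else 0.0  # with no data, Z = 0 and m is returned
    return (1.0 - z) * m + z * xbar


def poisson_gamma_posterior_mean(claims, alpha, beta):
    """Exact Bayes forecast E{x_(n+1) | x} for Poisson counts, Gamma(alpha, beta) prior."""
    return (alpha + sum(claims)) / (beta + len(claims))


if __name__ == "__main__":
    claims = [0, 2, 1, 0, 3]          # made-up annual claim counts for one risk
    alpha, beta = 3.0, 6.0            # made-up structure-function parameters
    m, n0 = alpha / beta, beta        # manual premium and time constant
    print(credibility_premium(claims, m, n0))
    print(poisson_gamma_posterior_mean(claims, alpha, beta))
```

The two printed values agree, which is the sense in which the credibility form is exact rather than approximate for this conjugate pair.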

Well, since that time there has been a virtual explosion
of research into various extensions and mathematical properties
of linearized Bayesian credibility models. I hope that my many
friends in the American-Swiss-Belgian-Australian-Italian-Portuguese-Norwegian-plus
School of Credibility will excuse me
if I do not try to enumerate and compare all their various contributions.
1974-1975 saw the first monograph [D8], as well as
the first research conference [K3] on credibility; a 1976 bibliography
[D7] lists 141 items, to which must be added at least
another 30 recent contributions.

Among the types of different models which have been developed,
we might mention:

(i) the problem of claim frequency and severity
(see below);

(ii) the effect of premium volume [B46];

(iii) estimation of extreme values and probabilities
[B39, D21, F11, J4, P7, S18];

(iv) minimax estimators [M1, S18];

(v) other sufficient statistics and "best" estimator
forms [D10, D17, J6];

(vi) seasonal and other nonstationary models [G6,
J9, K3, S22];

(vii) hierarchical models [D19, J11, S24, T1, T12, Z4];

(viii) multidimensional models [J1, J6, J9, J10];

(ix) conditionally linear models [B46, J9];

(x) regression and inverse regression models [H1,
J12, J13, J14, S24];

(xi) general, abstract, and "nonparametric" models
[H1, L15, S25, T5, Z1, Z2, Z3, Z4];

and many other special topics, such as: the influence of different
risk factors [A3]; rate classification [D11, W2]; bonus hunger
[N2, W3]; network flows [J15]; reliability [J16, J19], etc.

It is difficult to get a perspective on the field at this point;
my survey [J17] is already out of date, and a more recent one is
given by Norberg [N4]. The only obvious trend is that the
Scandinavian Actuarial Journal seems to have taken the lead over
the ASTIN Bulletin and the Swiss Mitteilungen in publishing
articles of this type!

It should also be pointed out that linearized Bayesian formulae
are constantly being developed in other fields; for example,
in communications theory, similar results arise in Wiener-Kalman
filter theory, where the emphasis is on adaptive updating formulae
for nonstationary processes. There have also been a number
of related articles in statistics, beginning with [E6, HS]; see
also references in [J17]. Unfortunately, these do not often
reference or acknowledge priority from credibility, in part
because of the limited circulation of actuarial journals. Two
recent statistical works which do reference credibility are
[D20, F1].

The  Problem  of  Severity 

The economy and simplicity of Bayesian model-building can
be illustrated by considering the problem of estimating the total
severity of a casualty claim; this problem has previously been
analyzed by Hewitt, Bühlmann, and the author using credibility
theory [B38, H10, J1]. Extending the compound claim process described
earlier, we suppose that in time period t (t = 1, 2, ..., n)
a random number of claims $k_t$ occur, with random individual claim
amounts $\{x_{t1}, x_{t2}, \dots, x_{tk_t}\}$, whose total, $s_t$, a random sum
of random variables, is the total severity for this time period.

We imagine that the number of claims is governed by an unknown
parameter $\phi$, and the claim size by an unknown parameter $\theta$,
which are fixed for an individual risk, but whose structure distribution
density, $u(\theta, \phi)$, is known over the collective. As
usual, we assume that, given the parameters, the individual claim
amounts and the number of claims are mutually independent, and
that all the claim amounts are identically distributed. The problem
is to estimate the total severity next period of this particular
risk, $s_{n+1}$, given the underlying probability laws, and the
observed data

$$V = [k_1, k_2, \dots, k_n ;\; x_{11}, x_{12}, \dots, x_{1k_1} ;\; \dots ;\; x_{n1}, x_{n2}, \dots, x_{nk_n}] .$$

Suppose we assume that individual claim size, given $\theta$, is
modellable by a density for which the sample mean claim is the
only sufficient statistic and that the claim frequency, given $\phi$,
is also modellable by a density for which the mean claim rate is
the only sufficient statistic; these are strong assumptions, but
are satisfied by many of the particular models used in the literature.
This means that the two basic model laws for x and k
belong to the simple exponential family, say:

$$p(x \mid \theta) = \frac{a(x) \exp(-\theta x)}{c(\theta)} ; \quad p(k \mid \phi) = \frac{b(k) \exp(-\phi k)}{d(\phi)} .$$


(These  are  different  densities;  we  are  using  the  usual  Bayesian 
trick  of  letting  the  arguments  describe  the  form,  range,  and 
measure  of  the  (possibly  discrete)  density,  rather  than  defining 
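new functions.) Now, only from the assumptions above, we can show
that this implies the likelihood of observing the data $\mathcal{D}$ must be

$$L(\theta,\phi) = p(\mathcal{D} \mid \theta,\phi) = \left[\prod_{t=1}^{n} a^{k_t *}(s_t)\, b(k_t)\right] \frac{\exp(-\theta s_\cdot - \phi k_\cdot)}{[c(\theta)]^{k_\cdot}\,[d(\phi)]^{n}}\,,$$

where the asterisk indicates convolution, and

$$k_\cdot = \sum_{t} k_t\,; \qquad s_\cdot = \sum_{t} s_t = \sum_{t}\sum_{j} x_{tj}\,,$$
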


are  the  total  number  of  claims  and  total  severity  observed  over 
all  the  time  periods. 

Clearly the term in square brackets in the data likelihood
can be ignored, and it follows that the new data $\mathcal{D}' = [k_\cdot, s_\cdot, n]$,
the total number of claims, the total severity, and the number of
time periods, are sufficient for $(\theta,\phi)$; in other words, any
additional combination or arrangement of the data is noninformative.
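
The reduction of the raw claim file to the three sufficient quantities can be sketched in a few lines. This is only an illustrative sketch: the function name `sufficient_statistics` and the list-of-lists data layout are assumptions made here, not notation from the paper.

```python
from typing import List, Tuple


def sufficient_statistics(claims_by_period: List[List[float]]) -> Tuple[int, float, int]:
    """Reduce raw claim records to (k_dot, s_dot, n).

    claims_by_period[t] holds the individual claim amounts observed in
    period t (an empty list means no claims in that period).
    """
    n = len(claims_by_period)                               # number of time periods
    k_dot = sum(len(period) for period in claims_by_period)  # total number of claims
    s_dot = sum(sum(period) for period in claims_by_period)  # total severity
    return k_dot, s_dot, n


# Two hypothetical arrangements of the same claims give the same statistics,
# which is the sense in which any further arrangement is noninformative.
data = [[120.0, 80.0], [], [300.0]]
reordered = [[300.0], [80.0, 120.0], []]
stats = sufficient_statistics(data)
```

Under the exponential-family assumptions stated above, any two data files with the same triple lead to the same posterior for the parameters.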




Now, if as Bayesians, we were able to completely specify
the joint prior density, $u(\theta,\phi)$, then an analytic or numerical
application of Bayes' law would update this prior to $u(\theta,\phi \mid \mathcal{D})$,
and this could be used to get a predictive density for $s_{n+1}$, in
principle.

But  this  calculation  appears  difficult  in  the  general  case, 
and  so  we  look  for  simplifications.  If  we  consider  predicting 
only  the  mean  severity  next  period  (the  experience-rated  fair 
premium), it can be shown that:

$$\mathcal{E}\{s_{n+1} \mid \mathcal{D}\} = \mathcal{E}\left\{ \frac{c'(\theta)\, d'(\phi)}{c(\theta)\, d(\phi)} \,\Big|\, \mathcal{D} \right\},$$


and  this  function  might  be  easier  to  integrate  than  the  complete 
predictive  density  with  particular  choices  for  the  underlying 
models. 

Notice that we have not yet assumed that the parameters
are independent; if they are dependent, this means that
observed (mixed) data from the collective will show a correlation
between average frequency and average claim size, which is
sometimes observed in automobile statistics.

If we are willing to additionally make the assumption that
the parameters are independent, $u(\theta,\phi) = u_1(\theta)\,u_2(\phi)$, say, and
that the natural conjugate priors to the basic likelihoods are
reasonable choices for $u_1$ and $u_2$, then we get almost
immediately that $\mathcal{E}\{s_{n+1} \mid \mathcal{D}\}$ is exactly the product of two
credibility formulae, one for claim amounts and one for frequency! In fact
the independence assumption has often been made in the literature.
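
As a concrete instance of this factorization, the sketch below takes one particular pair of models from the simple exponential family: Poisson claim counts with a gamma prior on the claim rate, and exponential claim amounts with a gamma prior on the exponential rate. These model choices, the hyperparameter names (`alpha`, `beta`, `a`, `b`), and the function names are assumptions made for illustration only; the paper does not commit to them.

```python
def credibility_mean(n_obs: float, n0: float, sample_mean: float, prior_mean: float) -> float:
    """Linear credibility formula z * sample_mean + (1 - z) * prior_mean, z = n/(n + n0)."""
    z = n_obs / (n_obs + n0)
    return z * sample_mean + (1.0 - z) * prior_mean


def predicted_total_severity(k_dot: int, s_dot: float, n: int,
                             alpha: float, beta: float,
                             a: float, b: float) -> float:
    """Posterior mean of next-period total severity under independent conjugate priors.

    Frequency: counts are Poisson(lam), lam ~ Gamma(shape=alpha, rate=beta),
               so E[lam | data] = (alpha + k_dot) / (beta + n).
    Severity:  claims are exponential with rate theta, theta ~ Gamma(shape=a, rate=b),
               a > 1, so E[1/theta | data] = (b + s_dot) / (a + k_dot - 1).
    """
    # Credibility formula for frequency: n periods observed, time constant beta.
    frequency = credibility_mean(n, beta, k_dot / n, alpha / beta)
    # Credibility formula for claim size: k_dot claims observed, time constant a - 1.
    mean_claim = s_dot / k_dot if k_dot > 0 else 0.0
    claim_size = credibility_mean(k_dot, a - 1.0, mean_claim, b / (a - 1.0))
    return frequency * claim_size


# Hypothetical numbers: 3 claims totalling 500 over 3 periods.
forecast = predicted_total_severity(k_dot=3, s_dot=500.0, n=3,
                                    alpha=2.0, beta=1.0, a=3.0, b=400.0)
```

Note that the two formulae use different "sample sizes": the frequency forecast is weighted by the number of periods n, while the claim-size forecast is weighted by the number of claims observed.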
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If  as  Bayesians,  we  feel  that  these  modelling  assumptions 
are too restrictive, we are free to make a credibility
approximation to the desired mean, by assuming that it is a linear
function of the data. But, from the likelihood, we see that
there are only two sample variables of interest, $s_\cdot$ and $k_\cdot$
(or their corresponding sample averages $s_\cdot/n$ and $k_\cdot/n$). The
corresponding 2-dimensional credibility prediction in the
correlated case has been carried out in [J1]; [B38] shows the expected
result that, in the uncorrelated case, the credibility forecast
factors into the product of two independent forecasts for amount
and frequency. I freely admit that in using credibility this way
we are not obtaining exact results, but only approximations;
however, by starting out with a full Bayesian analysis, we at least
"know where the cards lie." If the result that only $s_\cdot$ and $k_\cdot$
need be used seems suspicious, we must change the basic likelihoods
to more appropriate forms, and not blame dependence of the
parameters.

Daboni [76.1] has adopted an interesting approach to the
problem of correlation in severity which illustrates the Bayesian
approach to model uncertainty through the method of model mixtures
[J19, L11, L14]. Essentially, he proposes to use a prior density
of the form:

$$u(\theta,\phi) = \pi\, u_{11}(\theta)\, u_{21}(\phi) + (1 - \pi)\, u_{12}(\theta)\, u_{22}(\phi)\,,$$

where the initial choice of the mixing coefficient $\pi$ and the
parameters for the four (conjugate) priors, $u_{11}$ and $u_{12}$,
$u_{21}$ and $u_{22}$, are based upon observable means, variances, and
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covariances  for  the  collective.  The  hyperparameters  for  the 
priors are updated in the usual simple fashion, but now there is
a rather complex updating of the mixing coefficient, from $\pi$ to
a data-dependent $\pi(\mathcal{D})$. This approach is often used when there
is uncertainty about the prior model form. The only
philosophically unsatisfactory thing about this particular model is that,
in order to restrict the number of hyperparameters to be estimated,
the shape parameters of the four priors are constrained in a
special way. The paper does show that, as n approaches
infinity, the exact nonlinear-mixed-linear mean forecast does
factor into the product of two independent linear forecasts,
since (aha! now we remember our basic assumption) claim amounts
and frequencies are independent, given $(\theta,\phi)$, and, in the usual
Bayesian way, we may expect, for increasing data sets, to have
the posterior density approach a degenerate density at the true
values, with probability one. Furthermore, there are ways within
the Bayesian paradigm to estimate this convergence!
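
The updating of the mixing coefficient follows the general rule for a two-component mixture prior: the posterior weight is proportional to the prior weight times the marginal likelihood of the data under that component. The sketch below applies this rule to an assumed Poisson-gamma frequency model and exponential-gamma claim-size model; it is not Daboni's constrained parametrization, and all names (`log_marginal`, `updated_mixing_weight`, the hyperparameter tuples) are hypothetical.

```python
import math
from typing import Tuple


def log_marginal(shape: float, rate: float, events: float, exposure: float) -> float:
    """Log marginal likelihood under a Gamma(shape, rate) prior, up to data-only factors.

    Frequency (Poisson counts):   events = k_dot, exposure = n.
    Claim size (exponential):     events = k_dot, exposure = s_dot.
    Factors that do not depend on the prior (such as 1/k_t!) are dropped,
    since they cancel between the two mixture components.
    """
    return (shape * math.log(rate) - math.lgamma(shape)
            + math.lgamma(shape + events)
            - (shape + events) * math.log(rate + exposure))


def updated_mixing_weight(pi: float,
                          comp1: Tuple[float, float, float, float],
                          comp2: Tuple[float, float, float, float],
                          k_dot: int, s_dot: float, n: int) -> float:
    """Posterior weight of component 1; each comp is (alpha, beta, a, b), 0 < pi < 1."""
    def log_m(comp: Tuple[float, float, float, float]) -> float:
        alpha, beta, a, b = comp
        return log_marginal(alpha, beta, k_dot, n) + log_marginal(a, b, k_dot, s_dot)

    log_w1 = math.log(pi) + log_m(comp1)
    log_w2 = math.log(1.0 - pi) + log_m(comp2)
    top = max(log_w1, log_w2)            # subtract the maximum for numerical stability
    w1, w2 = math.exp(log_w1 - top), math.exp(log_w2 - top)
    return w1 / (w1 + w2)


# Hypothetical example: component 1 is close to the data, component 2 is far from it.
pi_post = updated_mixing_weight(0.5, (2.0, 1.0, 3.0, 400.0), (20.0, 1.0, 3.0, 4000.0),
                                k_dot=3, s_dot=500.0, n=3)
```

Within each component the hyperparameters update in the usual conjugate way; only the weight requires the marginal likelihoods.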

Prior  Estimation  and  Empiricism 

In  closing  this  section  on  Bayesian  modelling,  I  would  like 
to respond to Norberg [N4], who implies that I am clearly of
the genuine Bayesian persuasion, even if I do not display the
colours, and wonders whether I would surrender, or use some
empirical approach if a prior density $u(\theta)$ were not available.

Well,  as  a  philosopher,  I  am  a  good  engineer;  I  find  it  difficult 
to  imagine  that  I  would  not  have  some  prior  opinion,  based  upon 
observation,  about  almost  everything  in  our  real,  physical  world 
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of  man.  But  I  am  also  of  an  empirical  rather  than  abstract  bent, 
and,  heeding  Occam's  razor,  would  much  prefer  a  simple  model  of 
known  limitations  and  verifiable  dimension,  to  a  more  complex, 
untested  model  which  may  be  "more  accurate."  Thus,  when  being 
empirical,  I  prefer  to  make  explicit  the  simplifying  assumptions 
and  computational  shortcuts  I  am  taking,  in  order  to  be  able 
to  make  surprise-free  projections.  And  here,  displaying  my  colours 
at  last,  I  would  like  to  discuss  what  I  think  is  a  potential  heresy 
in  implementing  credibility  theory  under  certain  circumstances: 
the  problem  of  justifiable  empiricism. 

Recall the simplest model for predicting the mean value
$m(\theta) = \mathcal{E}\{x \mid \theta\}$ for a particular risk whose parameter $\theta$ is
fixed, but unobservable; to use credibility, we need only three
moments from the collective:

$$m = \mathcal{E}\,m(\theta)\,; \qquad E = \mathcal{E}\,v(\theta)\,; \qquad D = \mathcal{V}\,m(\theta)\,,$$

(in fact, we really only need $n_0 = E/D$). The underlying
conditional moments $m(\theta)$, $v(\theta)$ are not at issue, since they come
from the model likelihood $p(x \mid \theta)$, whose form is always assumed
to be modellable. But taking the expectation and variance of
these moments to find m, E, D has bothered many analysts,
since it involves the structure function, $u(\theta)$. They reason
that,  as  scientists,  they  would  prefer  not  to  have  too  "personal" 
an  opinion  about  this  prior,  but  would  like  to  set  to  work 
immediately, estimating m, E, D from the collective [80.42,
B46, D13, L16, N4, Z3, Z4].

Suppose that we have data sets of the same record length,
n, $x_i = [x_{i1}, x_{i2}, \ldots, x_{in}]$, from M members of this collective,
indexed by i, each one of which has an unobservable risk
parameter $\theta_i$ (i = 1, 2, ..., M). (Note that our notation is
transposed from other authors.) We shall henceforth let index s
be the particular risk that we are trying to experience-rate by
estimating $\mathcal{E}\{m(\theta_s) \mid X\}$, where X is the total collective data
file $X = [x_i] = [x_{it}]$.

The authors cited above consider the unknown $[\theta_i]$ to be
statistically mutually independent, and empirically propose
replacing m, E, and D by point estimators of classical form,
viz. if

$$\bar{x}_i = \sum_{t=1}^{n} x_{it}/n\,; \qquad \bar{x} = \sum_{i=1}^{M} \bar{x}_i/M\,;$$

then use:

$$\hat{m} = \bar{x}\,;$$

$$\hat{E} = \frac{1}{M(n-1)} \sum_{i=1}^{M}\sum_{t=1}^{n} (x_{it} - \bar{x}_i)^2\,;$$

$$\hat{D} + \frac{\hat{E}}{n} = \frac{1}{M-1} \sum_{i=1}^{M} (\bar{x}_i - \bar{x})^2\,.$$

As an empiricist, I have no quarrel with this approach;
in fact it is probably the one I would use, because I know that
these "unbiased, minimum variance" point estimators have certain
robust and appealing properties, especially when n and M are
both large and the underlying sums are essentially normally
distributed. But I would insist that anyone who uses (large)
sampling-school results should also be able to bound the possible
variation of these statistics about the true values, for whatever
reasonable range (and distribution?) of $\theta$ values he wishes to
make explicit, so that I can be scientifically satisfied that both
n and M are "large enough." In fact, I expect, from my
experience, that "large enough data" implies that all of the unbiased
niceties in the above formulae can be eliminated, with n-1 and
M-1 replaced by n and M, respectively, and $\hat{E}/n$ ignored
relative to $\hat{D}$. After all, the credibility formula is itself just
a point estimate, and is known through experience to be quite robust
to choices of m and E/D, in the sense that the data variability
noise overwhelms any slight error in the underlying conditional
mean, when observing an actual sample path of the credibility
statistic. If this variability in credible predictors is of
concern, then the modeller should forego credibility theory, and
make a full distributional analysis of $x_{s,n+1}$, conditional on
X, by whatever methods give him satisfactory results.

The extension of the above heuristic to the case of uneven
data record lengths, $n_i$ (i = 1, 2, ..., M), is straightforward.
Now, to be "confident," we would ask that "almost all" lengths
$n_i$ be large in order to use $\hat{m}$, $\hat{E}$, $\hat{D}$ (clearly, for the risk
s we are trying to rate, $n_s$ could still be small; in fact,
this is the region in which the credibility approach is most useful).
Incidentally, the classical approach often gets hung up on whether
or not it is correct to include $x_s$ in the data X used to
estimate m, E, D, given that we have already specified a
linear form in $x_s$ in the remainder of the credibility estimator;
this seems to me to be a problem in theology!


The  philosophical  deviation  about  which  I  am  most  concerned, 
however,  is  the  implication  that  estimates  like  the  above,  and 
their  generalizations  to  more  complex  models,  can  be  used  in 
place of prior moments no matter what the dimension of M or the
$n_i$. The modelling goal seems to be to create a formula in which
there is no information needed about $u(\theta)$, only data X (and
$x_s$). This approach is called empirical Bayes (eB), and the
corresponding formula the empirical credibility premium.
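
For equal record lengths, the empirical credibility premium built from the classical point estimators of m, E, and D can be sketched as follows. The function name, the data layout (one row per risk), and the fallback to the collective mean when the estimated D is not positive are assumptions made here for illustration; the cited authors differ on such details.

```python
from typing import List


def empirical_credibility_premium(X: List[List[float]], s: int) -> float:
    """Empirical credibility premium for risk s from an M x n data file X.

    Uses the classical estimators:
      m_hat = grand mean of the risk means,
      E_hat = pooled within-risk variance, divisor M * (n - 1),
      D_hat = between-risk variance of the means, divisor M - 1, minus E_hat / n.
    """
    M, n = len(X), len(X[0])
    risk_means = [sum(row) / n for row in X]
    m_hat = sum(risk_means) / M
    E_hat = sum((x - risk_means[i]) ** 2
                for i, row in enumerate(X) for x in row) / (M * (n - 1))
    D_hat = sum((xb - m_hat) ** 2 for xb in risk_means) / (M - 1) - E_hat / n
    if D_hat <= 0.0:
        # Assumed convention: no detectable heterogeneity, so give zero credibility.
        return m_hat
    z = n / (n + E_hat / D_hat)           # credibility factor with n0 = E_hat / D_hat
    return z * risk_means[s] + (1.0 - z) * m_hat


# Hypothetical collective of M = 3 risks with n = 2 observations each.
X = [[1.0, 3.0], [5.0, 7.0], [9.0, 11.0]]
premium_for_first_risk = empirical_credibility_premium(X, s=0)
```

With so few risks and observations the estimated variance components are themselves very noisy, which is exactly the small-sample concern raised in the text.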

The point is further confused by reference to the
Empirical-Bayes (EB) school, due to Robbins, Maritz and others, and to the
estimator of Stein. Now the EB approach is a schism of a different
colour, and I am not enough of a philosopher to either defend it or
attack it. Essentially, the EBers would presumably use data X
of any size to first estimate a density $u(\cdot)$ and then use this
in Bayes' law to find $\mathcal{E}\{m(\theta) \mid X\}$; but I suppose we might
stretch the point and say that an EBer would also estimate m,
E, D directly from the data, and use them in a Bayesian formula.
I do know that EB computational machinery is complex, and that
it can have other serious drawbacks such as incoherence, poor
small-sample properties, unknown rate of convergence, or even
lack of asymptotic optimality [D2, L11]. I am wary of any help
from this quarter.

But, what about the possibility of using the empirically
derived eB formula for arbitrary $n_i$ and M; couldn't this also
have some nice, robust properties? Well, yes, I would have to
admit, but they haven't been demonstrated yet! In other words,
if you propose to "ad hoc" up a complicated formula involving
sums, squared sums, and sums of squares of the data by
appeal to two different schools of thought, then I can only be
amazed at your ingenuity. But I suspect that you will have a
difficult time in proving these good properties analytically,
and will have to resort to, say, simulation (that is, to
experience). Note that even Norberg [N4] recommends that, for
small sample sizes, one should use all of the known information
about the particular form of $p(x \mid \theta)$ at hand in setting up
empirical estimators; for example, using $\hat{E} = \bar{x}$ if the outcome
is Poisson with parameter $\theta$.

But  what,  you  persist,  would  I  do  if  I  actually  did  not  have 
much  data  from  the  collective  risks  either?  Fortunately,  there 
is  another  approach  I  could  take  which  would  take  me  back  into 
the  realm  of  acceptable  (to  me)  empiricism,  by  referring  the 
estimation  problem  to  a  higher  level  data  source.  As  Lindley  so 
often  remarks  [L14],  there  are  really  no  completely  unconditional 
statements in statistics, since our mathematical conversation
can be extended just so far, and there will always remain
hypotheses, data, beliefs, physical conditions which it is not
efficient to include in our model unless our model proves to be
unsatisfactory;  contrary  to  popular  belief,  even  a  Bayesian 
may  change  his  mind  about  his  model  after  looking  at  new  data. 

In this case, since the Bayesian approach does not use the
collateral data $[x_i \mid i \neq s]$ (because of the independence
assumption on the $\theta_i$), I raise the question as to what tacit
conditions  were  present  when  I  made  this  assumption?  The  answer 
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is  that  I  assumed  that  the  collective  was  a  relatively  homogeneous 
cohort  of  customers  for  the  same  company.  However,  I  know  also 
from  experience  that  different  insurance  companies  have  different 
collectives  with  widely  different  risk  characteristics,  which  I 
might parametrize by a company parameter, $\phi$. Furthermore, I
know that there often exist extremely large, albeit highly
variable, statewide or nationwide data banks covering many
insurance companies. By creating a hierarchical credibility
model, I can use not only data from my risk s and my collective,
but also the "good" statistics from the collective of collectives.
The resulting linear three-part credibility formula mixes $\bar{x}_s$,
$\bar{x}$ (including risk s data), and the "manual premium," estimated
from nationwide data; also to be estimated by straightforward
empirical methods are three components of variance at nationwide,
company, and individual levels. Further details may be found in
[J11]; a generalization to multiple levels is in [T12]. [D2]
generally exorcises the EB spirit, for those of you who are still
of uncertain faith.
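
A three-part formula of this kind can be sketched as a nested pair of credibility weights. The weights below are the textbook hierarchical-credibility weights for M risks with equal record length n; they are an assumption made here for illustration and are not taken from [J11], whose exact form may differ. The names `E`, `D_risk`, and `D_company` stand for the three components of variance (within a risk, between risks in a company, and between companies).

```python
from typing import Tuple


def hierarchical_premium(x_s: float, x_bar: float, manual: float,
                         n: int, M: int,
                         E: float, D_risk: float, D_company: float
                         ) -> Tuple[float, Tuple[float, float, float]]:
    """Three-part linear credibility forecast and its weights.

    x_s     : sample mean of the risk being rated (n observations)
    x_bar   : company (collective) mean over M risks, including risk s
    manual  : manual premium estimated from nationwide data
    """
    z1 = n / (n + E / D_risk)                          # risk-level credibility
    z2 = M * z1 / (M * z1 + D_risk / D_company)        # company-level credibility
    company_estimate = z2 * x_bar + (1.0 - z2) * manual
    premium = z1 * x_s + (1.0 - z1) * company_estimate
    weights = (z1, (1.0 - z1) * z2, (1.0 - z1) * (1.0 - z2))
    return premium, weights


# Hypothetical inputs chosen so that both credibility factors equal one half.
premium, weights = hierarchical_premium(x_s=100.0, x_bar=80.0, manual=60.0,
                                        n=4, M=8, E=8.0, D_risk=2.0, D_company=0.5)
```

The three weights always sum to one, so the forecast is a weighted average of the individual experience, the company experience, and the manual premium.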


RISK-CLASSIFICATION:  A  COMMUNICATIONS  BREAKDOWN 

In  closing  this  survey  of  modelling  in  insurance,  I  would 
like  to  discuss  two  recent  examples  of  conflict  between  the 
traditional  methods  used  for  risk  classification  and  the  goals 
of  society. 

The  first  issue  is  whether  or  not  classification  by  sex 
may  be  legally  used  in  the  U.S.A.  for  defining  pension  benefits. 

The  rationale  for  male-female  differentiation  is,  I  am  sure, 
familiar  to  all  of  you.  By  reference  to  a  life  table,  such  as 
the  one  illustrated  in  Figure  7,  one  sees  that  the  mortality 
rates  are  substantially  different  at  all  ages,  and  one  calculates, 
for  example,  that: 

"If  a  male  and  a  female  employee  reach  age  65  with  an 
accumulation  of,  say  $100,000  each,  and  if  each  elects 
the  Single  Life  Annuity  Option,  the  annuity  income  is 
$11,450 a year for the man and $10,175 a year for the
woman" [E1].

In  a  series  of  rulings  handed  down,  beginning  in  1975,  to  various 
municipal,  state,  and  educational  pension  systems,  various  U.S. 
District  Courts  and  then  our  Supreme  Court  have  determined  that 
sex-classified  annuity  tables  violate  Title  VII  of  our  Civil  Rights 
Act  of  1964,  which  makes  it  unlawful  "to  discriminate  against  any 
individual  with  respect  to  his  compensation,  terms,  conditions  or 
privileges  of  employment,  because  of  the  individual's  race,  color, 
religion,  sex,  or  national  origin."  Of  course,  as  specific  court 
cases,  they  raised  a  variety  of  other  issues,  such  as  whether  the 
mortality  tables  were  representative  of  the  plaintiffs,  whether 
graduation  and  a  set-back  approximation  were  equitable,  the 


FIGURE  7 


Mortality  Rates  -  United  States  Life  Tables  1969-71 
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differences  between  money  purchase  and  formula  benefit  plans, 
and  so  on.  But  the  general  theme  was  that  collective-based 
arguments  for  "cost-based  pricing"  must  be  set  aside  in  favor 
of  individual  considerations  of  equity: 

"The  use  of  group  mortality  tables  attempts  to  predict 
only  the  life  expectancy  of  a  group  and  does  not  consider 
the  individual  female.  These  tables  cannot  predict  the 
life  expectancy  of  any  particular  individual,  regardless 
of  sex.  Since  actuarial  tables  do  not  predict  the  length 
of  any  individual's  life,  any  claim  that  such  tables  may 
be  used  to  assure  equal  pension  benefits  to  males  and 
females  over  their  lifetime,  must  fail"  (quoted  in  [M2]). 

Of  course  the  issue  is  much  broader  than  retirement  annuities, 
and includes all employee benefits which might involve
discrimination [76.4, E1, H7, K5, L1, M2, ZZ1].

A  related  social  issue  concerns  classification  schemes  for 
automobile  insurance  [ZZ2,  ZZ3,  ZZ4]  .  As  illustrated  in  Figure  8, 
group-average  accident  rates  vary  significantly  by  the  age  and 
the  sex  of  the  driver.  Although  there  is  no  Federal  law  covering 
access  to  the  highways,  the  freedom  to  use  an  automobile  has 
become  almost  an  inalienable  right  in  the  mass-transportation- 
limited  United  States,  and  the  skyrocketing  cost  of  insurance  is 
viewed  as  a  form  of  social  injustice  by  many  of  our  working  poor, 
particularly  when  ghetto  and  barrio  geographical  territory  rating 
classifications are imposed. The insurance commissioners of
several  states  have  taken  a  look  at  this  problem,  and  have  come 
to  rather  different  conclusions.  The  basic  philosophical  issue 
is  again  related  to  the  perceived  differences  between  individuals 
in  different  risk  classifications.  As  shown  in  Figure  9,  there 
is  a  significant  overlap  between  the  distributions  of  losses 
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FIGURE  8 

Reported  Accident  Rate  as  a  Function  of  Age, 
California  Motor  Vehicles  Records  1973-8 


between two classifications which have significantly different
pure premiums, meaning that a significant fraction of the "worse"
classification will see themselves as better drivers than a
large number of the "better" classification. A related problem
is that the process of classification often produces low-volume,
highly-variable, and heterogeneous high-risk classes, whereas
the low-risk classes are much larger and more homogeneous.

J. Ferreira, Jr., has developed an interesting model based on
utility theory which focuses on the individually perceived
inequities in being overcharged; by weighting overcharges more
heavily than undercharges, a certain redistribution of the total
premium takes place which dramatically reduces the perceived
inequities for the high-risk classes without affecting very much the
more stable classes [F3, F4, L10]. (Of course, individual
experience rating is a possible remedy for high initial automobile
insurance classifications.)
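
The size of the overlap between two classifications can be illustrated numerically. The sketch below assumes, purely for convenience, that individual expected losses in each class are independent and normally distributed; real loss distributions are skewed, and the class means and standard deviations used are hypothetical, not taken from Figure 9.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_worse_member_is_better(mu_better: float, sd_better: float,
                                mu_worse: float, sd_worse: float) -> float:
    """P(loss of a random 'worse'-class member < loss of a random 'better'-class member).

    Assumes independent normal losses within each class (an illustrative assumption).
    """
    z = (mu_better - mu_worse) / math.sqrt(sd_better ** 2 + sd_worse ** 2)
    return normal_cdf(z)


# Hypothetical classes: pure premiums of 100 and 150, each with standard deviation 50.
overlap = prob_worse_member_is_better(100.0, 50.0, 150.0, 50.0)
```

Even with a 50 percent difference in pure premium, roughly a quarter of such pairwise comparisons favour the member of the "worse" class under these assumed figures.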

The  literature  on  these  problems  is  fascinating  to  read, 
although  the  arguments  raised  on  both  sides  often  raise  more  heat 
than  light.  Insurance  professionals,  for  example,  quote  medical 
studies  on  the  effect  of  prostaglandins  on  mortality,  comment  on 
the  difficulty  of  constructing  unisex  mortality  tables,  and  speak 
direly  of  anti-selection,  plan  funding  instability,  unfair  benefit 
transfers  between  classes,  government  interference,  and  so  on. 
Lawyers  and  social  advocates  point  out  that  current  insurance 
does  not  normally  discriminate  on  a  racial  basis,  nor  include  a 
variety  of  other  obvious  "causal"  factors,  and  that  the  principle 
that  private  companies  must  be  responsive  to  social  objectives 
has been well established in other areas, such as hiring policy,
accident prevention, product liability, environmental protection,
and so on.

AD-A088 596  CALIFORNIA UNIV BERKELEY OPERATIONS RESEARCH CENTER  F/G 12/2
MODELS IN INSURANCE: PARADIGMS, PUZZLES, COMMUNICATIONS AND REV-- ETC(U)
JUN 80  W S JEWELL  AFOSR-77-3179
UNCLASSIFIED  ORC-80-10  NL


Rather  than  add  to  the  discussion  of  these  issues  —  I  am 
sure  you  all  have  strong  feelings  on  them  —  I  would  like  to 
return  again  to  a  point  made  at  the  beginning:  namely,  that 
model-building  is  not  only  a  tool  for  predicting  and  making 
decisions  about  natural  phenomena,  but  also  has  a  useful  function 
as  a  communications  medium  with  other  scientists  and  society  at 
large. And, in this case, I think we have a classic case of a
breakdown in communications: the two parties are not talking about
the same issue. In using the risk classification paradigm, actuaries
are  relying  upon  the  mean  value  principle  to  make  arguments  about 
equity;  on  the  other  hand,  societal  advocates  are  recognizing 
(however  imperfectly)  that  there  is  a  distribution  of  possible 
risk  outcomes,  and  that,  from  the  individual  point  of  view,  the 
paradigm  gives  non-socially-acceptable  results. 

Now,  I  would  admit  that  eliminating  classifications  and 
regrouping  risks  may  lead  to  higher  variances,  possible  funding 
problems,  plan  terminations,  and  other  systems  problems  which 
neither party wishes to happen. But notice that the analytic
argumentation was never extended to this level: the modelling,
analysis, and argumentation were not developed to examine rationally
all of the possible product policy changes under discussion;
instead, each community held fast to its own narrow world-view.

As  I  said  earlier,  model-building  must  be  primarily  a  useful 
activity.  When  we  find  that  a  paradigm  no  longer  serves  our  needs, 
then  we  must  modify  it  to  suit,  or  we  will  find  that  its  function  has 
been  bypassed  by  other  forces  in  business  and  society.  It  is 
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important  to  keep  the  communication  channels  open  to  our  public , 
however  painful  it  may  feel  to  modify  or  even  abandon  our 
traditional  paradigms. 
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THE  EVOLUTIONARY  FUTURE  OF  INSURANCE  MODELLING 

In summary, we have seen that the current state-of-the-world
in general insurance modelling is complex, characterized on the
one hand by vigorous activity and growth, and on the other hand
by uneven stages of development, a certain tendency towards
puzzle-solving, and examples of communication breakdowns. Does
this mean that a modelling crisis and a Kuhnian revolution are
at hand?

For a variety of reasons, I believe that future insurance
modelling will be evolutionary, not revolutionary. One very
important reason is reflected in the wide range of communications
offered at the ICA's, at ASTIN Colloquia, and at your own national
meetings; thus, a variety of novel methods and models, often
transposed from other fields, continue to be presented and examined
on their own merits, rather than being subjected to a tradition-oriented
screening process. This receptiveness to new ideas is critical to
the  healthy  evolution  of  a  field,  and  it  is  delightful  to  see  that 
it  is  often  the  senior  statesmen  of  insurance  who  are  actively 
trying  out  and  promoting  new  ideas. 

A community must also invest a portion of its own resources
in the future, which is why continuing research activity is
important. Here there are some "straws in the wind," such as the
formation of the Geneva Association to study economics of insurance,
and  the  new  interest  of  American  societies  in  sponsoring  research 
projects.  More  research  support  is  needed  from  industry,  in  my 
opinion. 

New  ideas  are  not  useful  unless  communicated  to  our  own 
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community  and  to  other  scientific  disciplines;  and,  here  also, 
there  are  good  signs  of  growth,  such  as  the  strengthening  and 
increased  dissemination  of  national  journals,  and  the  recent 
decision  of  Mathematical  Reviews  to  abstract  interesting  papers 
from  the  ASTIN  Bulletin  and  the  Swiss  Actuarial  Association 
Bulletin,  as  well  as  continuing  with  the  Scandinavian  Actuarial 
Journal.  In  America,  I  would  hope  to  see  ARCH  grow  into  a  national 
research  journal  which  could  transcend  the  traditional  actuarial 
boundaries,  and  encourage  contributions  from  other  scientists 
interested in insurance modelling. Internationally, I believe
we might re-interpret ASTIN to mean simply Actuarial STudies in
INsurance, and develop the membership, the symposia, and the
Bulletin to include all those in our community and from other
disciplines who are interested in modelling and research.

And  finally,  there  must  be  continuing  evolution  of  the 
educational process, for the student of today is the actuary of
tomorrow, and must be trained in the concepts and methods which
will be useful in the future. As indicated earlier, I perceive
a  serious  mismatch  between  the  abilities  of  today's  graduate 
and  the  demands  placed  upon  him  or  her  by  current  actuarial 
examinations  and  professional  assignments.  The  number  of  textbooks 
published  since  1969  [S6,  B7,  B37,  L9,  G13]  portends  well,  and  I 
understand  that  various  actuarial  societies  have  additional  basic 
texts  under  development;  the  national  reports  on  actuarial 
training  presented  to  this  Congress  also  reveal  some  interesting 
and  innovative  steps.  As  an  educator,  I  urge  you  to  continue  to 
devote  attention  and  resources  to  the  formation  of  young  people. 

As  for  yourselves,  I  urge  you  all  to  continue  to  be  receptive 
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of  and  tolerant  towards  new  methods,  models,  and  paradigms, 
analyzing  and  testing  them,  not  through  reaction,  but  in  terms  of 
their potential utility to the actuarial community and the
insurance enterprise. Change is necessary; as Tennyson says so
beautifully:

"The old order changeth, yielding place to new,
And God fulfils Himself in many ways,
Lest one good custom should corrupt the world."

The evolution of the '80's will, I believe, make it an
exciting  and  challenging  decade  for  insurance  modelling,  and 
I  look  forward  to  participating  in  it  with  you. 


Thank  you  for  your  kind  attention. 
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